FLUENTD 101 BOOTSTRAP OF UNIFIED LOGGING
Open Source Summit Japan 2017 Fluentd Mini Summit / June 1, 2017
Satoshi Tagomori (@tagomoris) Treasure Data, Inc.
Satoshi "Moris" Tagomori (@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
What is Fluentd?
"Fluentd is an open source data collector for unified logging layer."
https://www.fluentd.org/
"Unified Logging Layer" ?
"Fluentd decouples data sources from backend systems by providing a unified logging layer in between."
AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL
Simple Core w/ Plugin System + Various Plugins
Buffering, Retries, Failover
Configuration (v0.14)

# logs from a file
<source>
  @type tail
  path /var/log/httpd.log
  pos_file /tmp/pos_file
  tag web.access
  <parse>
    @type apache2
  </parse>
</source>

# logs from client libraries
<source>
  @type forward
  port 24224
</source>

# store logs to ES and HDFS
<match web.*>
  @type copy
  <store>
    @type elasticsearch
    logstash_format true
  </store>
  <store>
    @type webhdfs
    host namenode.local
    port 50070
    path /path/on/hdfs
    <format>
      @type json
    </format>
  </store>
</match>
Implementation / Performance
• Fluentd is written in Ruby
• C extension libraries for performance-critical parts
  • cool.io: Asynchronous I/O
  • msgpack: Serialization/Deserialization for MessagePack
• Plugins are written in Ruby - easy to join the community
• Scaling across CPU cores
  • multiprocess plugin (~ v0.12)
  • multi process workers (v0.14 ~)
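As a rough illustration of the multi process workers bullet, the sketch below assumes a v0.14 release that already includes the feature. The worker count of 4 is an arbitrary example value, and not every plugin can necessarily run under multiple workers.

```
# Sketch only: start several worker processes from one configuration.
<system>
  # number of worker processes (example value, not a recommendation)
  workers 4
</system>
```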
Package / Deployment
• Fluentd is released on RubyGems.org
• rpm/deb packages
  • td-agent by Treasure Data
  • Fluentd + some widely-used plugins
  • td-agent2: Fluentd v0.12
  • td-agent3 (beta now): Fluentd v0.14 (or v1)
  • + msi package for Windows
• Docker images
  • https://hub.docker.com/r/fluent/fluentd/
For Users in the Enterprise Sector
More security features and some others
https://fluentd.treasuredata.com/
Plugin System
[Figure: 3rd party input plugins (dstat, df, AMQL, munin, jvmwatcher, SQL, ...) and 3rd party output plugins (Graphite, ...)]
[Figure: plugin types, grouped as "input-ish" and "output-ish"]
Fluentd v0.12: Input, Parser, Filter, Formatter, Buffer, Output
Fluentd v0.14: Input, Parser, Filter, Formatter, Buffer, Output + Storage, Helper
Plugins
• Built-in plugins
  • tail, forward, file, exec, exec_filter, copy, ...
• 3rd party plugins (755 plugins as of May 19)
  • fluent-plugin-xxx via rubygems.org
  • webhdfs, kafka, elasticsearch, redshift, bigquery, ...
• Plugin script
  • .rb script files in /etc/fluent/plugin
Ecosystem
Events
Events: Structured Logs
tag: ec_service.shopping_cart
timestamp: 2017-03-30 16:35:37 +0100
record: {"container_id": "bfdd5b9....","container_name": "/infallible_mayer","source": "stdout","event": "put an item to cart","item_id": 101,"items": 10,"client": "web"}
Event: tag, timestamp and record
• Tag
  • A dot-separated string
  • Shows what the event is / where the event is from
• Timestamp
  • An integer of unix time (~ v0.12)
  • A structured timestamp with nanosecond precision (v0.14 ~)
• Record
  • Key-value pairs
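Because the tag is a dot-separated string, it can be used to choose where an event goes. The sketch below shows the two wildcard forms in match patterns: `*` matches exactly one tag part and `**` matches zero or more. The `stdout` output is just a stand-in chosen for illustration.

```
# Matches ec_service.shopping_cart, but not ec_service.shopping_cart.v2
<match ec_service.*>
  @type stdout
</match>

# Matches ec_service, ec_service.shopping_cart.v2, and so on.
# An event goes to the first <match> whose pattern fits its tag, so this
# block only receives events the block above did not take.
<match ec_service.**>
  @type stdout
</match>
```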
[Figure: data flow from source to destination]
Data Source
• input plugin: read / receive raw data, add tags
• parser plugin: parse data into key-values, parse timestamp from record
• router: routes events
• output plugin with buffering
  • format plugin: events into formatted data
  • buffer plugin
• write / send
Data Destination
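The stages in this flow can be mapped onto a configuration roughly as follows. This is a sketch in v0.14 syntax, not a tested setup: all paths and the `app.log` tag are hypothetical, and the choice of the built-in `tail` and `file` plugins is only for illustration.

```
<source>
  # input plugin: reads raw lines and adds the tag
  @type tail
  path /var/log/app.log
  pos_file /tmp/app.log.pos
  tag app.log
  <parse>
    # parser plugin: turns each raw line into key-values
    @type json
  </parse>
</source>

<match app.**>
  # output plugin with buffering
  @type file
  path /data/out/app
  <format>
    # format plugin: serializes events before they are buffered
    @type json
  </format>
  <buffer>
    # buffer plugin: holds formatted data until it is written
    @type file
    path /data/buffer/app
  </buffer>
</match>
```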
Buffers and Retries
Buffer & Retry for Micro-Try&Error
[Figure labels: Stream, Batch, Error, Retry]
Controlled Recovery from Long Outage
Buffer (on-disk or in-memory)
[Figure labels: Error, queued chunks, recovery (Overloaded!!), recovery + flow control]
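A buffer section along these lines is one way to express the retry and recovery behavior in v0.14 syntax. It is a sketch: the server name, paths, and every numeric value are assumptions picked for illustration, not recommendations from the talk.

```
<match web.*>
  @type forward
  <server>
    # hypothetical destination
    host aggregator.example.com
    port 24224
  </server>
  <buffer>
    # keep queued chunks on disk so they survive a restart
    @type file
    path /data/buffer/web
    # try to flush chunks every 10 seconds
    flush_interval 10s
    # wait longer and longer between retries during an outage
    retry_type exponential_backoff
    retry_wait 1s
    retry_max_interval 60s
    # stop retrying a chunk after 72 hours
    retry_timeout 72h
    # cap the total buffer size and block inputs when it is full
    total_limit_size 1GB
    overflow_action block
  </buffer>
</match>
```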
Last Resort: Secondary Output
[Figure labels: Error, queued chunks]

Configuration: Secondary Output (v0.14)

# store logs to ES, or file if it goes down
<match web.*>
  @type elasticsearch
  logstash_format true
  <secondary>
    @type secondary_file
    path /data/backup/web
  </secondary>
</match>
Forwarding Data via Network
Forward Plugin: Forwarding Data via Network
• Built-in plugin, TCP port 24224 (by default)
  • with heartbeat protocol (default UDP in v0.12, TCP/TLS in v0.14)
• Transfer data via TCP
  • From Fluentd to Fluentd
  • From logger libraries to Fluentd
  • From Fluent-bit to Fluentd
  • From Docker logging driver to Fluentd
• Standard protocol
  • https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1
  • Spec v0 at Fluentd v0.12 or earlier -> Spec v1 at Fluentd v0.14
Forward: Features
• Load Balancing / High Availability
  • Servers with weights
  • Standby servers
  • DNS round robin
• Controlling Data Transferring Semantics
  • at-most-once / at-least-once transferring
• Spec v1 Features
  • TLS support
  • Simple authentication/authorization
  • Efficient data transferring: Gzip Compression (v0.14)
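Two of the Spec v1 features, simple authentication and gzip compression, might be configured as in the sketch below. The host names and the shared key are placeholders, and TLS settings are left out on purpose because they depend on the exact Fluentd release.

```
# sender side
<match web.*>
  @type forward
  # compress chunks before sending (v0.14)
  compress gzip
  <security>
    self_hostname sender.example.com
    # placeholder; must match the receiver's shared_key
    shared_key hypothetical_shared_key
  </security>
  <server>
    host receiver.example.com
    port 24224
  </server>
</match>

# receiver side
<source>
  @type forward
  port 24224
  <security>
    self_hostname receiver.example.com
    shared_key hypothetical_shared_key
  </security>
</source>
```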
Forward Plugin: Load Balancing
[Figure: three servers with weight: 60 (default) and one server with weight: 20]
Balance Data with Configured Weight
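The weights in the figure correspond to the `weight` parameter of each `<server>` section in the forward output. In this sketch the host names are hypothetical; servers without an explicit `weight` use the default of 60.

```
<match web.*>
  @type forward
  <server>
    # weight defaults to 60
    host fluentd1.example.com
    port 24224
  </server>
  <server>
    host fluentd2.example.com
    port 24224
  </server>
  <server>
    host fluentd3.example.com
    port 24224
  </server>
  <server>
    # receives a smaller share of the data than the others
    host fluentd4.example.com
    port 24224
    weight 20
  </server>
</match>
```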
Forward Plugin: Handling Server Failure
Detecting Server Down/Up using Heartbeat
[Figure label: Error]
Forward Plugin: Standby Server
Detecting Server Down/Up using Heartbeat
[Figure labels: standby: true, Error]
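A standby server is declared per `<server>` section. The sketch below uses hypothetical host names; the standby server is meant to receive data only while the active server is detected as down.

```
<match web.*>
  @type forward
  <server>
    # active server
    host primary.example.com
    port 24224
  </server>
  <server>
    # used only when the active server is unavailable
    host backup.example.com
    port 24224
    standby true
  </server>
</match>
```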
Forward Plugin: Without ACK (at-most-once)
Without any trouble
[Figure: forward output (buffer) -> forward input -> any output (buffer)]
With trouble in the destination's buffers
[Figure: forward output (buffer) -> forward input -> any output (buffer)]
Events will be lost in this case :(
Forward Plugin: With ACK (at-least-once)
Forward output: require_ack_response: true
[Figure: forward output sends a buffer chunk with its chunk id; forward input returns an ACK with the chunk id]
[Figure: when the ACK is missing, forward output keeps the buffer chunk and retries it with the same chunk id]
Forward output ensures that buffers are transferred to any living destination :D
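In configuration terms, at-least-once transfer is switched on in the forward output. This is a sketch with hypothetical host names and paths; the timeout value shown is an assumption for illustration.

```
<match web.*>
  @type forward
  # keep each chunk until the destination returns an ACK with its chunk id
  require_ack_response true
  # treat the transfer as failed and retry if no ACK arrives in time
  ack_response_timeout 190s
  <server>
    host fluentd1.example.com
    port 24224
  </server>
  <server>
    # a second living destination that retries can go to
    host fluentd2.example.com
    port 24224
  </server>
  <buffer>
    @type file
    path /data/buffer/forward
  </buffer>
</match>
```

Since a chunk is resent whenever its ACK is missing, the destination may receive the same events more than once, which is the trade-off of at-least-once semantics.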
Fluentd has many good features for logging.
Discover more on docs.fluentd.org!
Happy Logging! @tagomoris