‘tail’ in Fluent Bit - Standard Configuration
Background and Overview
‘tail’ is one of the most popular plugins; it reads log messages from local files. It allows you to monitor one or more text files and collect their contents with Fluent Bit or Fluentd. In this blog series we are going to cover a use case where the ‘tail’ plugin is used to obtain data from a log file with Fluent Bit. It is pretty common to gather event data from various systems using Fluent Bit and send it to Fluentd or other applications: since Fluent Bit is fast and lightweight, it makes it easy to collect events from different sources without complexity, and since Fluentd is flexible and highly functional, it can aggregate multiple inputs, process the data, and route it to different outputs.
In this blog, as the start of the use case, we will run Fluent Bit using the ‘tail’ plugin with a standard configuration file. The two blogs that follow cover advanced configuration such as multiline parsing.
‘tail’ in Fluent Bit - Standard Configuration
↑ This blog will cover this section!
System Environments for this Exercise
The system environment used in the exercise below is as follows:
CentOS 8
Fluent Bit v2.0.6
VM specs: 2 CPU cores / 2GB memory
We assume that you have already downloaded Fluent Bit on your machine and that it is waiting for its first use. If you haven’t installed Fluent Bit yet, visit this website to get ready.
Standard Configuration Pattern
The scenario described in this lab is based on the following directory structure:
/fluentbit : root directory
|--- conf
|    |--- custom_parsers.conf
|    |--- Lab01
|         |--- (Lab01 configuration files)
|         |--- sample
|              |--- (Sample log files for exercise)
|--- log
|--- buffer
Here is a sample ‘tail’ configuration. For more details on each option, visit the official manual website.
Path:
Specify a log file or multiple files through the use of common wildcards. Multiple patterns separated by commas are also allowed.

Read_from_head:
For newly discovered files on start (without a database offset position), read the content from the head of the file rather than the tail.

DB:
Specify the database file to keep track of monitored files and offsets.
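As a quick illustration of how a wildcard in ‘Path’ expands, the matching works like shell-style globbing. The following Python sketch only mimics that expansion for a hypothetical /var/log/messages* pattern; Fluent Bit performs the matching itself:

```python
import glob

# Hypothetical pattern; 'Path' accepts the same shell-style wildcards,
# e.g.  Path /var/log/messages*
for path in sorted(glob.glob("/var/log/messages*")):
    print(path)  # each matched file is monitored independently
```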
[INPUT]
    Name              tail
    Tag               syslog
    Path              /var/log/messages
    Read_from_head    true
    DB                /fluentbit/tail_syslog.db
    Storage.type      filesystem
In the above sample, both ‘Read_from_head’ and ‘DB’ options are configured.
If you would like to read existing log messages as well, ‘Read_from_head’ is recommended. By default, ‘Read_from_head’ is set to ‘false’, which means Fluent Bit does not read the log messages that already exist in the files specified by the ‘Path’ option; it reads only the new log messages that arrive after discovery. Note that with ‘Read_from_head true’, if Fluent Bit crashes and needs to restart, it reads each file from the head again, which causes data duplication. To avoid this, it is recommended to use the ‘DB’ option to persist the offset information.
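The interplay between ‘Read_from_head’ and ‘DB’ can be sketched in a few lines of Python. This is an illustrative simulation of offset tracking, not Fluent Bit’s actual implementation, and the file names are hypothetical:

```python
def tail_once(log_path, offset_path):
    """Read new lines from log_path, resuming from a stored offset."""
    try:
        with open(offset_path) as f:   # like the 'DB' option
            offset = int(f.read())
    except FileNotFoundError:
        offset = 0                     # like 'Read_from_head true' on first run
    with open(log_path) as f:
        f.seek(offset)
        new_lines = f.readlines()      # only lines after the stored offset
        offset = f.tell()
    with open(offset_path, "w") as f:  # persist the offset for the next run
        f.write(str(offset))
    return new_lines
```

Running tail_once twice against an unchanged file returns every line the first time and nothing the second, which mirrors the restart behavior demonstrated later in the exercise.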
You can learn more details and other options on the following page:
https://docs.fluentbit.io/manual/pipeline/inputs/tail/
Let's move on to the exercise to learn the behavior of the ‘tail’ plugin.
Exercise
One of the common use cases is reading Linux OS logs. RHEL/CentOS store their logs in /var/log/messages. In this exercise, we use a sample Linux log: ‘sample/sample01_linux_messages.txt’.
Here is a sample Linux log, and a sample configuration that reads the log and prints the records to stdout. In the ‘tail’ section, we use both the ‘Read_from_head’ and ‘DB’ options.
sample01_linux_messages.txt
Oct 27 16:14:31 fluent01 systemd[1]: Started dnf makecache.
Oct 27 16:20:29 fluent01 systemd[1]: Starting system activity accounting tool...
Oct 27 16:20:29 fluent01 systemd[1]: Started system activity accounting tool.
Oct 27 16:40:29 fluent01 kubelet[896]: W1027 16:40:29.280967 896 watcher.go:95] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/system.slice/sysstat-collect.service: no such file or directory
Oct 27 16:40:29 fluent01 kubelet[896]: W1027 16:40:29.281027 896 watcher.go:95] Error while processing event ("/sys/fs/cgroup/blkio/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/system.slice/sysstat-collect.service: no such file or directory
Oct 27 16:40:29 fluent01 kubelet[896]: W1027 16:40:29.281048 896 watcher.go:95] Error while processing event ("/sys/fs/cgroup/memory/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/system.slice/sysstat-collect.service: no such file or directory
sample01_flb_tail_standard.conf
Make sure that ‘Path’ specifies the text file shown above.
[SERVICE]
    ## General settings
    Flush                     5
    Log_Level                 Info
    Daemon                    off
    Log_File                  /fluentbit/log/fluentbit.log
    Parsers_File              /fluentbit/conf/custom_parsers.conf
    ## Buffering and Storage
    Storage.path              /fluentbit/buffer/
    Storage.sync              normal
    Storage.checksum          Off
    Storage.backlog.mem_limit 5M
    Storage.metrics           On
    ## Monitoring (if required)
    HTTP_Server               true
    HTTP_Listen               0.0.0.0
    HTTP_Port                 2020
    Health_Check              On
    HC_Errors_Count           5
    HC_Retry_Failure_Count    5
    HC_Period                 60

[INPUT]
    Name              tail
    Tag               linux.messages
    Path              /fluentbit/conf/Lab01/sample/sample01_linux_messages.txt
    Storage.type      filesystem
    Read_from_head    true
    DB                /fluentbit/tail_linux_messages.db

[OUTPUT]
    Name     stdout
    Match    linux.messages
Let's run Fluent Bit and check the behavior!
Run Fluent Bit using this configuration file.
$ fluent-bit -c sample01_flb_tail_standard.conf
Fluent Bit v2.0.6
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
After a while you can also see the following messages in stdout.
As you can see from the output, the whole log message is nested under the ‘log’ key by default, since there is no parser definition in the configuration. Parsing plain text into JSON will be covered in another blog.
Also, the timestamp generated by Fluent Bit is not the same as the timestamp in the original messages. For the first message, for instance, the timestamp generated by Fluent Bit is 1677106496.274148557, which is equal to February 22, 2023 10:54:56 PM GMT and is different from the timestamp in the original message, Oct 27 16:14:31. Timestamp adjustment will be covered in another blog.
[0] linux.messages: [1677106496.274148557, {"log"=>"Oct 27 16:14:31 fluent01 systemd[1]: Started dnf makecache."}]
[1] linux.messages: [1677106496.274161493, {"log"=>"Oct 27 16:20:29 fluent01 systemd[1]: Starting system activity accounting tool..."}]
[2] linux.messages: [1677106496.274162041, {"log"=>"Oct 27 16:20:29 fluent01 systemd[1]: Started system activity accounting tool."}]
[3] linux.messages: [1677106496.274162423, {"log"=>"Oct 27 16:40:29 fluent01 kubelet[896]: W1027 16:40:29.280967 896 watcher.go:95] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/system.slice/sysstat-collect.service: no such file or directory"}]
[4] linux.messages: [1677106496.274163003, {"log"=>"Oct 27 16:40:29 fluent01 kubelet[896]: W1027 16:40:29.281027 896 watcher.go:95] Error while processing event ("/sys/fs/cgroup/blkio/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/system.slice/sysstat-collect.service: no such file or directory"}]
[5] linux.messages: [1677106496.274163394, {"log"=>"Oct 27 16:40:29 fluent01 kubelet[896]: W1027 16:40:29.281048 896 watcher.go:95] Error while processing event ("/sys/fs/cgroup/memory/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/system.slice/sysstat-collect.service: no such file or directory"}]
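The epoch timestamps that Fluent Bit prints can be converted to human-readable form, for example with a short Python snippet:

```python
from datetime import datetime, timezone

# First record's timestamp from the stdout output above
ts = 1677106496.274148557
dt = datetime.fromtimestamp(int(ts), tz=timezone.utc)
print(dt.strftime("%B %d, %Y %H:%M:%S %Z"))
# → February 22, 2023 22:54:56 UTC
```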
The DB file is created at /fluentbit/tail_linux_messages.db. You can inspect its contents with the following steps. In this run, the stored record contains:
File ID (generated by tail plugin): 1
File name: /fluentbit/conf/Lab01/sample/sample01_linux_messages.txt
Offset: 1172
inode: 2322445
Created time: 1677106496
$ sqlite3 /fluentbit/tail_linux_messages.db
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> SELECT * FROM in_tail_files;
1|/fluentbit/conf/Lab01/sample/sample01_linux_messages.txt|1172|2322445|1677106496|0
To exit the sqlite3 prompt, press “Ctrl” + “d”.
Then let's stop Fluent Bit and start again.
In the first run, Fluent Bit already read the six lines in the log file, and the offset information was stored in the DB file. Our expectation is that Fluent Bit will not read those six lines again.
$ fluent-bit -c sample01_flb_tail_standard.conf
Fluent Bit v2.0.6
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
If everything is fine, no messages are shown in stdout. This means that Fluent Bit restarted for the second run but did not process the same lines again, since the DB file holds the information on where to resume.
Great, you finished the first exercise! You figured out how to use the ‘tail’ plugin in Fluent Bit to gather data from a typical log file. If you would like to continue to the next exercise, check out our next blog post: Parsing in Fluent Bit using Regular Expression
Need more help? - We are here for you.
In the Fluentd Subscription Network, we provide consultancy and professional services to help you run Fluentd and Fluent Bit with confidence by solving your pain points. A service desk is also available for your operations, and the team is equipped with Diagtool and the know-how of running Fluent Bit/Fluentd in production. Contact us anytime if you would like to learn more about our service offerings.