Multiline Parsing Best Practice in Fluent Bit
Background and Overview
Many log files contain events that span multiple lines, such as stack traces. If Fluent Bit reads such a file line by line, each line becomes a separate record, and the extracted data ends up fragmented, inconsistent, or incomplete. Parsing these events as a unit keeps the extracted data accurate and useful.
With multiline logs parsed correctly, users get a complete view of each event, can spot patterns and anomalies that are hard to see in fragmented records, and can diagnose application performance problems more easily. This helps organizations troubleshoot and optimize their applications and infrastructure, improving reliability and reducing downtime.
This blog post is the third and last part of our Fluent Bit use case series, as shown below.
Multiline Parsing in Fluent Bit
↑ This blog will cover this section!
System Environments for this Exercise
The system environment used in this exercise is as follows:
CentOS 8
Fluent Bit v2.0.6
VM specs: 2 CPU cores / 2GB memory
Exercise
The directory structure remains the same as in the two exercises from our earlier blog posts:
/fluentbit : root directory
|--- conf
|    |--- custom_parsers.conf
|    |--- Lab01
|         |-- (Lab01 configuration files)
|         |-- sample
|              |-- (Sample log files for exercise)
|--- log
|--- buffer
In the previous blog post, we parsed log data that had multiple lines in the same format using a regular expression (“regex”). Having similar formats in all the lines made it relatively easy to parse the log data.
In some cases, however, you need to merge multiple log lines into a single event. The following Fluentd log file, for instance, has stack trace messages from line #3 to #22. These lines should be treated as a single log event to keep the log message meaningful. This is where the ‘Multiline Parsing’ feature comes in.
sample02_multiline.txt (Fluentd log file example)
2022-10-21 23:42:04 +0000 [info]: gem 'fluent-plugin-utmpx' version '0.5.0'
2022-10-21 23:42:04 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.5.0'
2022-10-21 23:42:04 +0000 [warn]: For security reason, setting private_key_passphrase is recommended when cert_path is specified
/opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/cert_option.rb:89:in `read': No such file or directory @ rb_sysopen - ./cert/fluent01.key.pem (Errno::ENOENT)
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/cert_option.rb:89:in `cert_option_load'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/cert_option.rb:65:in `cert_option_server_validate!'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/server.rb:330:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin/in_forward.rb:102:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin.rb:187:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:320:in `add_source'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:161:in `block in configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:155:in `each'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:155:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/engine.rb:105:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/engine.rb:80:in `run_configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/supervisor.rb:668:in `run_supervisor'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/command/fluentd.rb:356:in `<top (required)>'
from <internal:/opt/fluentd/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
from <internal:/opt/fluentd/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/bin/fluentd:15:in `<top (required)>'
from /opt/fluentd/bin/fluentd:25:in `load'
from /opt/fluentd/bin/fluentd:25:in `<main>'
2022-10-21 23:42:04 +0000 [info]: gem 'fluent-plugin-splunk-hec' version '1.2.9'
2022-10-21 23:42:04 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
Enabling the ‘multiline parsing’ feature takes almost the same steps as custom parsing. The first step is to write a custom regex that matches the first line of each event and define it as a parser. Here is the parser for the Fluentd log shown earlier:
[PARSER]
    Name        FLUENTD_LOG
    Format      regex
    Regex       /^(?<time>[^ ]* {1,2}[^ ]* [^ ]*)\s+\[(?<level>[\s\w]*)\]\:\s+(?<message>.*)$/
    Time_Key    time
    Time_Format %Y-%m-%d %H:%M:%S
    Time_Keep   On
Then configure ‘multiline parsing’ settings in the tail section.
‘Multiline’ : Set to ‘On’ to enable the multiline parsing feature
‘Parser_Firstline’ : Name of the parser that matches the first line of each multiline event
[INPUT]
    Name             tail
    Tag              linux.messages
    Path             /fluentbit/conf/Lab01/sample/sample02_multiline.txt
    Storage.type     filesystem
    Read_from_head   true
    #DB /fluentbit/tail_linux_messages.db
    Multiline        On
    Parser_Firstline FLUENTD_LOG
The complete sample configuration file is shown below:
sample03_flb_tail_multiline_parser.conf
[SERVICE]
    ## General settings
    Flush                     5
    Log_Level                 Info
    Daemon                    off
    Log_File                  /fluentbit/log/fluentbit.log
    Parsers_File              /fluentbit/conf/custom_parsers.conf
    ## Buffering and Storage
    Storage.path              /fluentbit/buffer/
    Storage.sync              normal
    Storage.checksum          Off
    Storage.backlog.mem_limit 5M
    Storage.metrics           On
    ## Monitoring (if required)
    HTTP_Server               true
    HTTP_Listen               0.0.0.0
    HTTP_Port                 2020
    Health_Check              On
    HC_Errors_Count           5
    HC_Retry_Failure_Count    5
    HC_Period                 60
[INPUT]
    Name             tail
    Tag              linux.messages
    Path             /fluentbit/conf/Lab01/sample/sample02_multiline.txt
    Storage.type     filesystem
    Read_from_head   true
    #DB /fluentbit/tail_linux_messages.db
    Multiline        On
    Parser_Firstline FLUENTD_LOG
[OUTPUT]
    Name  stdout
    Match linux.messages
Let's run Fluent Bit with the sample configuration.
Run Fluent Bit
$ fluent-bit -c sample03_flb_tail_multiline_parser.conf
Check the output
As you can see, the stack traces from line #3 to #22 in the original file were merged into a single event as expected.
[0] linux.messages: [1666395724.000000000, {"time"=>"2022-10-21 23:42:04 +0000", "level"=>"info", "message"=>"gem 'fluent-plugin-utmpx' version '0.5.0'"}]
[1] linux.messages: [1666395724.000000000, {"time"=>"2022-10-21 23:42:04 +0000", "level"=>"info", "message"=>"gem 'fluent-plugin-webhdfs' version '1.5.0'"}]
[2] linux.messages: [1666395724.000000000, {"time"=>"2022-10-21 23:42:04 +0000", "level"=>"warn", "message"=>"For security reason, setting private_key_passphrase is recommended when cert_path is specified
/opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/cert_option.rb:89:in `read': No such file or directory @ rb_sysopen - ./cert/fluent01.key.pem (Errno::ENOENT)
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/cert_option.rb:89:in `cert_option_load'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/cert_option.rb:65:in `cert_option_server_validate!'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/server.rb:330:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin/in_forward.rb:102:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/plugin.rb:187:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:320:in `add_source'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:161:in `block in configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:155:in `each'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/root_agent.rb:155:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/engine.rb:105:in `configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/engine.rb:80:in `run_configure'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/supervisor.rb:668:in `run_supervisor'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/lib/fluent/command/fluentd.rb:356:in `<top (required)>'
from <internal:/opt/fluentd/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
from <internal:/opt/fluentd/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
from /opt/fluentd/lib/ruby/gems/3.0.0/gems/fluentd-1.14.6/bin/fluentd:15:in `<top (required)>'
from /opt/fluentd/bin/fluentd:25:in `load'
from /opt/fluentd/bin/fluentd:25:in `<main>'"}]
[3] linux.messages: [1666395724.000000000, {"time"=>"2022-10-21 23:42:04 +0000", "level"=>"info", "message"=>"gem 'fluent-plugin-splunk-hec' version '1.2.9'"}]
[4] linux.messages: [1666395724.000000000, {"time"=>"2022-10-21 23:42:04 +0000", "level"=>"info", "message"=>"gem 'fluent-plugin-systemd' version '1.0.5'% "}]
Congratulations! You finished the last exercise of this use case.
In this blog we shared one of the simplest ways to parse multiline log data using Fluent Bit. When ‘Multiline On’ is set in the ‘[INPUT]’ section, as in this blog, Fluent Bit applies the same multiline configuration to every log read by that input, so all of them are matched against the same first-line pattern regardless of their content or format.
The other option is to define parsing rules in a ‘[MULTILINE_PARSER]’ section. This lets you define multiple parsing rules for different log formats or sources and apply a different one to each input, which gives you finer-grained control over how logs are parsed. For example, you can define one rule for Apache logs and another for Nginx logs, each with its own pattern and configuration.
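As a rough sketch of the ‘[MULTILINE_PARSER]’ approach, the Fluentd log from this exercise could be handled along the following lines. The parser name fluentd_multiline, the two regex patterns, and the flush timeout value are assumptions made for this illustration and are not part of the original exercise. The ‘[MULTILINE_PARSER]’ section would go into a file loaded through Parsers_File (for example custom_parsers.conf), and the ‘[INPUT]’ section into the main configuration file.

```conf
# --- in the parsers file (e.g. custom_parsers.conf) ---
[MULTILINE_PARSER]
    # "fluentd_multiline" is a hypothetical name chosen for this sketch
    name          fluentd_multiline
    type          regex
    # assumed value (milliseconds) before a pending event is flushed
    flush_timeout 1000
    # rules | state name    | regex pattern                                   | next state
    # A new event starts with a timestamp such as "2022-10-21 23:42:04 "
    rule      "start_state"   "/^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} /"         "cont"
    # Any line that does NOT start with a timestamp continues the event
    rule      "cont"          "/^(?!\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ).*/"   "cont"

# --- in the main configuration file ---
[INPUT]
    Name             tail
    Tag              linux.messages
    Path             /fluentbit/conf/Lab01/sample/sample02_multiline.txt
    Read_from_head   true
    # replaces the 'Multiline' and 'Parser_Firstline' settings for this input
    multiline.parser fluentd_multiline
```

Because each tail input names its own ‘multiline.parser’, a second input reading a different log format can point to a different ‘[MULTILINE_PARSER]’ definition. Note that, as far as we understand, this approach only merges the lines into one string (under the ‘log’ key by default), so extracting fields such as time and level would need a separate parsing step. Please verify the behavior against the documentation for your Fluent Bit version.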
Want to learn more? - Let’s get in touch.
In the Fluentd Subscription Network, we provide consultancy and professional services that help you run Fluentd and Fluent Bit with confidence and solve the pain points you run into. A service desk is also available to support your operations, and the team is equipped with Diagtool and practical know-how for running Fluent Bit and Fluentd in production. Contact us anytime if you would like to learn more about our service offerings.

