Parsing Logs with Regular Expressions using Fluentd
Overview
Fluentd is a powerful tool for log collection and processing. One of its most useful features is the ability to parse logs using regular expressions (regex). This allows you to extract specific information from your logs and structure them in a way that makes them easier to analyze.
In this post, we'll go through some examples of how to use regex with Fluentd to parse logs.
System Environments for this Exercise
The system environment used in the exercises below is as follows.
Rocky Linux release 8.6
td-agent 4.5.0 (fluentd 1.16.1)
Basic Regex Parsing
Let's start with a basic example. Suppose you have the following log message.
[2023/07/20 15:30:45] [ info] [input:tail:tail.0] inotify_fs_add(): inode=12345678 watch_fd=1 name=/var/log/example.log
You can use the following Fluentd configuration to parse this log.
<source>
@type tail
path /var/log/sample.log
pos_file /var/log/td-agent/sample.pos
tag sample-topic
read_from_head true
<parse>
@type regexp
expression /(?<message>\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*)/
</parse>
</source>
<match **>
@type stdout
</match>
The output will look something like this.
2023-07-20 23:47:56.810795577 +0000 sample-topic: {"message":"[2023/07/20 15:30:45] [ info] [input:tail:tail.0] inotify_fs_add(): inode=12345678 watch_fd=1 name=/var/log/example.log"}
This configuration uses the regexp parser to match the log message against the provided regex pattern. The (?<message>...) part of the pattern is a named capture group that extracts the entire log message and assigns it to the message field in the resulting record.
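Named capture groups can also split a single log line into multiple fields instead of keeping it whole. As a sketch (the field names time, level, and message are choices for this example, not anything Fluentd requires), the same log line could be parsed like this:

```
<parse>
  @type regexp
  expression /\[(?<time>[^\]]+)\]\s+\[\s*(?<level>\w+)\]\s+(?<message>.*)/
  time_key time
  time_format %Y/%m/%d %H:%M:%S
</parse>
```

With this configuration, the timestamp inside the brackets is used as the event time, and the record contains separate level and message fields, which is usually easier to filter and search downstream than one monolithic string.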
Handling Unmatched Logs
Sometimes your logs may contain messages that don't match your regex pattern. By default, Fluentd logs a warning when this happens. For example, if your logs contain the message "Hello World", which does not match the pattern above, Fluentd will log a "pattern not matched" warning.
You can suppress these warnings by adding the emit_invalid_record_to_error false option to your configuration.
Note: This option is not available in the source directive. Please add this option to the filter directive.
<source>
@type tail
path /var/log/sample.log
pos_file /var/log/td-agent/sample.pos
tag sample-topic
read_from_head true
<parse>
@type none
message_key prod
</parse>
</source>
<filter sample-topic*>
@type parser
key_name prod
emit_invalid_record_to_error false
<parse>
@type regexp
expression /(?<message>\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*)/
</parse>
</filter>
<match **>
@type stdout
</match>
Parsing Multiline Logs
Fluentd also supports parsing multiline logs. This is useful when your logs contain messages that span multiple lines. For example, consider the following log message.
[2021/12/07 21:49:04] [ info] Hello
from
Fluentd
!!
You can use the multiline parser to handle this kind of log.
<source>
@type tail
path /var/log/sample.log
pos_file /var/log/td-agent/sample.pos
tag sample_log
read_from_head true
<parse>
@type multiline
format_firstline /\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*/
format1 /^(?<message>.*)/
</parse>
</source>
<match **>
@type stdout
</match>
The output will look like this.
2023-07-21 23:16:53.222715003 +0000 sample_log: {"message":"[2021/12/07 21:49:04] [ info] Hello\nfrom\nFluentd\n!!"}
This configuration uses the multiline parser to match the first line of each log message against the format_firstline pattern. It then uses the format1 pattern to extract the entire message, including any additional lines.
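The multiline parser also accepts named capture groups in format1, so you can extract fields while still joining the continuation lines. A sketch, assuming the same timestamp and level layout as in the examples above:

```
<parse>
  @type multiline
  format_firstline /\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]/
  format1 /\[(?<time>[^\]]+)\]\s+\[\s*(?<level>\w+)\]\s+(?<message>.*)/
  time_key time
  time_format %Y/%m/%d %H:%M:%S
</parse>
```

Because the multiline parser matches its combined pattern in multiline mode, the message group here captures the first line's text together with the continuation lines, while time and level become separate fields.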
Conclusion
Fluentd's regex parsing capabilities make it a powerful tool for processing logs. Whether you're dealing with simple single line messages or complex multiline logs, Fluentd can help you extract the information you need.
Happy logging!
Need more help? - We are here for you.
Through the Fluentd Subscription Network, we provide consultancy and professional services to help you run Fluentd and Fluent Bit with confidence and solve your operational pain points. A service desk is also available, staffed by a team equipped with Diagtool and practical knowledge of running Fluentd and Fluent Bit in production. Contact us anytime if you would like to learn more about our service offerings.

