Parsing Logs with Regular Expressions using Fluentd
Overview
Fluentd is a powerful tool for log collection and processing. One of its most useful features is the ability to parse logs using regular expressions (regex). This allows you to extract specific information from your logs and structure them in a way that makes them easier to analyze.
In this post, we'll go through some examples of how to use regex with Fluentd to parse logs.
System Environments for this Exercise
The system environment used in the exercise below is as follows.
Rocky Linux release 8.6
td-agent 4.5.0 (fluentd 1.16.1)
Basic Regex Parsing
Let's start with a basic example. Suppose you have the following log message.
[2023/07/20 15:30:45] [ info] [input:tail:tail.0] inotify_fs_add(): inode=12345678 watch_fd=1 name=/var/log/example.log
You can use the following Fluentd configuration to parse this log.
<source>
  @type tail
  path /var/log/sample.log
  pos_file /var/log/td-agent/sample.pos
  tag sample-topic
  read_from_head true
  <parse>
    @type regexp
    expression /(?<message>\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*)/
  </parse>
</source>

<match **>
  @type stdout
</match>
The output will look something like this.
2023-07-20 23:47:56.810795577 +0000 sample-topic: {"message":"[2023/07/20 15:30:45] [ info] [input:tail:tail.0] inotify_fs_add(): inode=12345678 watch_fd=1 name=/var/log/example.log"}
This configuration uses the regexp parser to match each log message against the provided regex pattern. The (?<message>...) part of the pattern is a named capture group that extracts the entire log message and assigns it to the message field in the resulting record.
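Since Fluentd is written in Ruby, the same extraction can be sketched in plain Ruby. This is only an illustration of how a named capture group becomes a record field, not Fluentd's actual internals; the log line is taken from the example output above.

```ruby
# Apply the same pattern the regexp parser uses. The named capture
# group (?<message>...) becomes a field in the structured record.
pattern = /(?<message>\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*)/

line = "[2023/07/20 15:30:45] [ info] [input:tail:tail.0] " \
       "inotify_fs_add(): inode=12345678 watch_fd=1 name=/var/log/example.log"

if (m = pattern.match(line))
  record = { "message" => m[:message] }  # mirrors the stdout record above
  puts record
end
```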
Handling Unmatched Logs
Sometimes your logs may contain messages that don't match your regex pattern. By default, Fluentd logs a warning when this happens. For example, if your logs contain the message "Hello World", Fluentd will emit a "pattern not matched" warning for that line.
You can suppress these warnings by adding the emit_invalid_record_to_error false option to your configuration.
Note: This option is not available in the source directive. Add it to the filter directive instead.
<source>
  @type tail
  path /var/log/sample.log
  pos_file /var/log/td-agent/sample.pos
  tag sample-topic
  read_from_head true
  <parse>
    @type none
    message_key prod
  </parse>
</source>

<filter sample-topic*>
  @type parser
  key_name prod
  emit_invalid_record_to_error false
  <parse>
    @type regexp
    expression /(?<message>\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*)/
  </parse>
</filter>

<match **>
  @type stdout
</match>
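To see the effect of dropping unmatched records, here is a rough Ruby sketch of the per-record behavior when emit_invalid_record_to_error is false. The sample lines are illustrative; this is not the filter plugin's real code.

```ruby
pattern = /(?<message>\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*)/

raw_records = [
  "[2023/07/20 15:30:45] [ info] service started",  # matches the pattern
  "Hello World"                                     # does not match
]

emitted = []
raw_records.each do |raw|
  if (m = pattern.match(raw))
    emitted << { "message" => m[:message] }  # re-emitted downstream
  end
  # Unmatched lines fall through silently: no warning, no error record.
end

puts emitted.length  # only the matching record survives
```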
Parsing Multiline Logs
Fluentd also supports parsing multiline logs. This is useful when your logs contain messages that span multiple lines. For example, consider the following log message.
[2021/12/07 21:49:04] [ info] Hello
from
Fluentd
!!
You can use the multiline parser to handle this kind of log.
<source>
  @type tail
  path /var/log/sample.log
  pos_file /var/log/td-agent/sample.pos
  tag sample_log
  read_from_head true
  <parse>
    @type multiline
    format_firstline /\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*/
    format1 /^(?<message>.*)/
  </parse>
</source>

<match **>
  @type stdout
</match>
The output will look like this.
2023-07-21 23:16:53.222715003 +0000 sample_log: {"message":"[2021/12/07 21:49:04] [ info] Hello\nfrom\nFluentd\n!!"}
This configuration uses the multiline parser to match the first line of each log message against the format_firstline pattern. It then uses the format1 pattern to extract the entire message, including any additional lines.
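The grouping behavior can be sketched in a few lines of Ruby: a line matching format_firstline starts a new record, and every other line is appended to the current one with a newline, which is why the output above contains "\n" between the fragments. This is a simplified illustration, not the multiline parser's actual implementation.

```ruby
firstline = /\[\d{4}\/\d{2}\/\d{2}\s+\d{2}:\d{2}:\d{2}\]\s+\[[\s\w]+\]\s+.*/

lines = ["[2021/12/07 21:49:04] [ info] Hello", "from", "Fluentd", "!!"]

records = []
lines.each do |l|
  if firstline.match?(l)
    records << l.dup            # a firstline starts a new record
  elsif records.any?
    records.last << "\n" << l   # continuation lines are joined with \n
  end
end

puts records.first  # the joined multiline message
```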
Conclusion
Fluentd's regex parsing capabilities make it a powerful tool for processing logs. Whether you're dealing with simple single line messages or complex multiline logs, Fluentd can help you extract the information you need.
Happy logging!
Need more help? - We are here for you.
Through the Fluentd Subscription Network, we provide consulting and professional services to help you run Fluentd and Fluent Bit with confidence. A service desk is also available for your operations, staffed by a team equipped with Diagtool and practical knowledge of running Fluent Bit and Fluentd in production. Contact us anytime if you would like to learn more about our service offerings.