Logging: Fluentd with Kubernetes
Kubernetes in Production
More and more people are using Kubernetes in production today, as you can see from the CNCF survey conducted in early 2020.
The use of containers in production has increased to 92%, up from 84% last year, and up 300% from
our first survey in 2016.
Kubernetes use in production has increased to 83%, up from 78% last year.
55% of respondents use stateful applications in containers in production.
Monitoring - Tracing, Time Series & Logging
People often talk about “monitoring” in the context of tracing, such as distributed tracing from the application perspective, and of time series monitoring technologies. However, the fundamental piece of application and infrastructure monitoring is “logging”, and it becomes a different story when you move away from a traditional bare metal or VM based architecture. This is also captured in the CNCF survey, where 22% of respondents answered that “logging” is a challenge in using or deploying containers.
Logging Challenges
For stable operation of Kubernetes, you need to capture the events happening in your running applications by gathering log information from all of them. Since applications on Kubernetes run as Docker containers, there are a couple of considerations for logging:
Log information is stored locally on the host OS without associated Kubernetes metadata, such as namespace, pod name, and labels.
Log information is deleted when containers are terminated.
Kubernetes itself provides native functionality to capture log messages through the “kubectl logs” command, but this approach does not scale well in large, distributed environments. That’s where Fluentd comes in. Fluentd helps you centralize the log information of running applications together with Kubernetes metadata and route it to desired destinations such as ElasticSearch or AWS S3.
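For reference, reading a single pod’s log with the native functionality looks like this (pod and namespace names are placeholders):

$ kubectl logs <pod-name> -n <namespace>
$ kubectl logs -f <pod-name> -n <namespace>   # stream the log continuously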
In this post, I will share how Fluentd works with Kubernetes, using an example EFK (ElasticSearch/Fluentd/Kibana) stack configuration.
You can learn more about the logging concepts of Kubernetes in Logging Architecture.
How Fluentd works with Kubernetes
Fluentd provides the “Fluentd DaemonSet”, which enables you to easily collect log information from containerized applications. With a DaemonSet, you can ensure that all (or some) nodes run a copy of a pod. Fluentd also provides the “fluent-plugin-kubernetes_metadata_filter” plugin, which enriches pod log information by annotating records with Kubernetes metadata. With that, you can identify where log information comes from and filter it easily using the tagged records.
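As a minimal sketch, the filter is enabled in a Fluentd configuration like the following; the official DaemonSet images already ship an equivalent section, and “kubernetes.**” is the tag prefix they use for container logs:

# Enrich records tagged kubernetes.** with pod/namespace/label metadata
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>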
You can learn more about Fluentd DaemonSet in Fluentd Doc - Kubernetes.
The “Fluentd DaemonSet” also delivers pre-configured container images for major logging backends such as ElasticSearch, Kafka, and AWS S3. You can find the available Fluentd DaemonSet container images and sample configuration files for deployment in Fluentd DaemonSet for Kubernetes.
Logging with EFK Stack
Let’s see how Fluentd works in Kubernetes through an example use case with the EFK stack. In this example, I deployed nginx pods and services and reviewed how log messages are handled by Fluentd and visualized using ElasticSearch and Kibana. With this example, you can learn how Fluentd behaves in Kubernetes logging and how to get started.
An overview of the example use case is described in the following image.
In this post, I use an external ElasticSearch and Kibana with an SSL-enabled configuration.
Deploying nginx pods and services
I deployed nginx pods and services following the steps described in Connecting Applications with Services; a command sketch follows the list below.
1. Create “my-nginx” pods with multiple replicas.
2. Create “my-nginx“ service.
3. Get endpoints of “my-nginx” service.
4. Perform an HTTP request to the “my-nginx” endpoints, and check the access log information in the “kubectl logs” output.
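Roughly, the command sequence for steps 1 through 4 looks like the following, based on the manifests referenced in that guide; endpoint IPs and pod names are placeholders:

# Step 1: create the "my-nginx" Deployment (two nginx replicas)
$ kubectl apply -f https://k8s.io/examples/service/networking/run-my-nginx.yaml

# Step 2: expose it as the "my-nginx" service
$ kubectl expose deployment/my-nginx

# Step 3: get the endpoints of the service
$ kubectl get endpoints my-nginx

# Step 4: send a request to an endpoint, then check the access log
$ curl http://<endpoint-ip>:80/
$ kubectl logs <my-nginx-pod-name>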
Kubernetes metadata, such as namespace and container image, is not included in the “kubectl logs” output.
You can also find log information under the “/var/log/pods/{pod name}” directories on the host.
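On the node itself, the layout looks roughly like this; the exact directory and file names depend on the Kubernetes version and container runtime:

$ ls /var/log/pods/
default_my-nginx-<hash>_<pod-uid>
kube-system_etcd-<node-name>_<pod-uid>
...
$ cat /var/log/pods/default_my-nginx-<hash>_<pod-uid>/my-nginx/0.log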
Deploying Fluentd DaemonSet
1. Clone Git repository of Fluentd DaemonSet.
2. Create “ServiceAccount” and “ClusterRole” for Fluentd DaemonSet.
3. Create the Fluentd DaemonSet with the ElasticSearch configuration (a command sketch for steps 1-3 follows the configuration notes below).
In this post, I use the configuration for an external ElasticSearch cluster.
FLUENT_ELASTICSEARCH_HOST : "elastic01.demo.local"
FLUENT_ELASTICSEARCH_PORT : "9200"
FLUENT_ELASTICSEARCH_SCHEME : "https"
FLUENT_ELASTICSEARCH_USER : "elastic"
FLUENT_ELASTICSEARCH_PASSWORD : {password of user 'elastic'}
FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX : {your custom prefix name, "fluentd.k8sdemo" for instance}
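As a sketch, these values go into the env section of the Fluentd container spec in the DaemonSet manifest; the values shown are the example values above, and the fragment below is not a complete manifest:

        env:
          - name: FLUENT_ELASTICSEARCH_HOST
            value: "elastic01.demo.local"
          - name: FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "https"
          - name: FLUENT_ELASTICSEARCH_USER
            value: "elastic"
          - name: FLUENT_ELASTICSEARCH_PASSWORD
            value: "<password of user elastic>"
          - name: FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX
            value: "fluentd.k8sdemo"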
Also, I use the following container image for the ElasticSearch connection:
image: fluent/fluentd-kubernetes-daemonset:v1.11.5-debian-elasticsearch7-1.1
You can find the available environment variables for the “fluent/fluentd-kubernetes-daemonset:v1.11.5-debian-elasticsearch7-1.1” image in the conf directory of v1.11.5-debian-elasticsearch7-1.1.
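Putting steps 1 through 3 together, the commands look roughly like this; I assume the combined RBAC manifest name used in the fluentd-kubernetes-daemonset repository, which bundles the ServiceAccount, ClusterRole, ClusterRoleBinding, and the DaemonSet itself:

# Step 1: clone the DaemonSet repository
$ git clone https://github.com/fluent/fluentd-kubernetes-daemonset.git
$ cd fluentd-kubernetes-daemonset

# Steps 2 and 3: edit the env section as described above, then apply the manifest
$ kubectl apply -f fluentd-daemonset-elasticsearch-rbac.yaml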
4. Check status of Fluentd DaemonSet.
You can check more details with the “kubectl describe” command for troubleshooting.
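For example, assuming the default object names and labels from the repository manifests (the DaemonSet is deployed in the “kube-system” namespace):

$ kubectl get daemonset fluentd -n kube-system
$ kubectl get pods -n kube-system -l k8s-app=fluentd-logging
$ kubectl describe pod <fluentd-pod-name> -n kube-system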
Checking messages in Kibana
Once the Fluentd DaemonSet reaches “Running” status without errors, you can review log messages from the Kubernetes cluster in the Kibana dashboard.
Log messages are stored in the index defined by “FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX” in the DaemonSet configuration. In this post, I used "fluentd.k8sdemo" as the prefix.
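To confirm that the indices are being created before building a Kibana index pattern such as "fluentd.k8sdemo-*", you can query ElasticSearch directly; the host and user below are the example values from this post:

$ curl -u elastic 'https://elastic01.demo.local:9200/_cat/indices/fluentd.k8sdemo-*?v'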
The Fluentd DaemonSet collects log information from “etcd” and “kube-controller-manager” as well as from custom application pods.
You can see that the Fluentd DaemonSet enriches log information with Kubernetes metadata.
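As an illustration, an enriched record for one of the “my-nginx” access log lines looks roughly like this; the field values are placeholders and the exact set of fields depends on the plugin version:

{
  "log": "10.244.1.1 - - [...] \"GET / HTTP/1.1\" 200 ...",
  "stream": "stdout",
  "kubernetes": {
    "namespace_name": "default",
    "pod_name": "my-nginx-<hash>",
    "container_name": "my-nginx",
    "labels": { "run": "my-nginx" },
    "host": "<node name>"
  }
}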
Commercial Service - We are here for you.
Through the Fluentd Subscription Network, we provide consultancy and professional services to help you run Fluentd and Fluent Bit with confidence by solving your pains. A service desk is also available for your operations, and the team is equipped with Diagtool and knowledge of tips for running Fluentd in production. Contact us anytime if you would like to learn more about our service offerings.