My EKS clusters depend on Fluentd daemonsets to send log messages to ElasticSearch. Docker wraps container log messages, line by line, in JSON, and splits messages larger than 16 KB into 16 KB chunks. This causes problems when those messages are structured JSON (embedded within Docker's JSON), since the fragments are no longer parseable.
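For illustration, a long message in Docker's json-file log ends up as several consecutive JSON records like the following (content and timestamps are made up); only the final chunk's log field ends with a newline:

```json
{"log":"first 16KB chunk of a long structured message ...","stream":"stdout","time":"2022-01-01T00:00:00.000000000Z"}
{"log":"... remaining bytes of the message\n","stream":"stdout","time":"2022-01-01T00:00:00.000000001Z"}
```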
I've tried configuring fluent-plugin-concat to identify split messages and reassemble them before sending them to ElasticSearch. Despite my attempts, the messages either remain split, or nothing gets sent to ES.
All my attempts use the following input configuration in their fluentd.conf:
<source>
@type tail
@id in_tail_container_logs
path /var/log/containers/*.log
pos_file /var/log/fluentd-containers.log.pos
tag raw.containers.*
read_from_head true
<parse>
@type json
time_type string
time_format %Y-%m-%dT%H:%M:%S.%NZ
keep_time_key true
</parse>
</source>
This attempt doesn't concatenate split log messages:
<filter raw.containers.**>
@type concat
key log
use_partial_metadata true
separator ""
</filter>
This causes nothing to appear in ES for any split message.
<filter raw.containers.**>
@type concat
key log
multiline_end_regexp /\\n$/
separator ""
</filter>
This blocks all processing with errors in the fluentd log indicating "logtag" isn't present in the JSON coming back from Docker.
<filter raw.containers.**>
@type concat
key log
use_partial_cri_logtag true
partial_cri_logtag_key logtag
partial_cri_stream_key stream
separator ""
</filter>
How should fluent-plugin-concat, or for that matter, fluentd in general, be configured to re-assemble these split log messages before further processing?
I have an answer for you, but you will not like it.
First things first...
These first two options do not work because they are not available in Kubernetes; they work only with plain Docker / Docker Swarm. But you are running Kubernetes on EKS 1.22 (I assume with the default containerd runtime under the hood).
<filter raw.containers.**>
@type concat
key log
use_partial_metadata true
separator ""
</filter>
<filter raw.containers.**>
@type concat
key log
multiline_end_regexp /\\n$/
separator ""
</filter>
This one should work, but only if you change your runtime from containerd to CRI-O, because only the CRI-O runtime adds the "F" / "P" flag to your logs in Kubernetes. This flag is required for merging logs with the logtag feature.
<filter raw.containers.**>
@type concat
key log
use_partial_cri_logtag true
partial_cri_logtag_key logtag
partial_cri_stream_key stream
separator ""
</filter>
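For reference, under a CRI runtime the log files on the node are not Docker JSON; each line follows the CRI format `<timestamp> <stream> <P|F> <message>`, where `P` marks a partial chunk and `F` a full/final line (sample lines with made-up content):

```
2022-01-01T00:00:00.000000000Z stdout P first part of a long line
2022-01-01T00:00:00.000000001Z stdout F last part of the line
```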
Long story short: if you want to merge logs in a k8s cluster that were split because of the 16 KB maximum size, you need to use the CRI-O container runtime inside the k8s cluster and use the logtag setup in fluentd.
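Putting it together, here is a sketch of what the fluentd side could look like under CRI-O: a regexp-based parse of the CRI line format feeding the logtag-based concat filter. Treat this as an untested outline (paths, tags, and the time format are assumptions to adapt to your setup):

```
<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag raw.containers.*
  read_from_head true
  <parse>
    # CRI log line: <time> <stream> <P|F> <log>
    @type regexp
    expression /^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.*)$/
    time_format %Y-%m-%dT%H:%M:%S.%N%:z
  </parse>
</source>

<filter raw.containers.**>
  @type concat
  key log
  use_partial_cri_logtag true
  partial_cri_logtag_key logtag
  partial_cri_stream_key stream
  separator ""
</filter>
```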