I will publish some Docker containers incorporating a common logging framework (written in Go). The log format is JSON.
There is distinct data in this custom JSON logging format that I would like to be indexed and searchable from Kibana. My understanding is that I need to transform/filter this data, but I'm struggling to understand how this is done, even after RTFM. Do I have to extract JSON from JSON?
Some example output from a minimal sample application, as seen in the Docker logs:
{"app_name":"SampleApp","app_port":6666,"app_version":"0.0.2","file":"/build/examples/sample/app/runmain/main.go:131","func":"example.com/code/microservices/examples/sample/app/runmain.mainErr.func1","fw_version":"v0.0.1","level":"info","msg":"listening","time":"2024-01-18T20:31:39.163970213+07:00"}
The data makes its way into fluentd and is logged as:
2024-01-18 20:31:39.000000000 +0000 f905b090d278: {"container_id":"f905b090d278ec2cc2f1f912acdbf8787a0a1c91d8ab7b00ad84e9da20c8c147","container_name":"/fervent_jemison","source":"stdout","log":"{\"app_name\":\"SampleApp\",\"app_port\":6666,\"app_version\":\"0.0.2\",\"file\":\"/build/examples/sample/app/runmain/main.go:131\",\"func\":\"example.com/code/microservices/examples/sample/app/runmain.mainErr.func1\",\"fw_version\":\"v0.0.1\",\"level\":\"info\",\"msg\":\"listening\",\"time\":\"2024-01-18T20:31:39.163970213+07:00\"}\r"}
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<filter docker.**>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>
<match *.**>
  @type copy
  <store>
    @type elasticsearch
    host es
    port 9200
    user elastic
    password elastic
    logstash_format true
    logstash_prefix fluentd
    logstash_dateformat %Y%m%d
    include_tag_key true
    type_name access_log
    tag_key @log_name
    <buffer>
      flush_interval 1s
    </buffer>
  </store>
  <store>
    @type stdout
  </store>
</match>
Initially, I'll deploy this all locally on a single host.
Any tips or further direction would be greatly appreciated. This is a new world to me.
You are dealing with logs sent and received through the forward input, so you cannot simply parse them as JSON with a filter like this:
<filter docker.**>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>
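To make the shape of the forwarded record concrete, here is a minimal Python sketch (Python is used only for illustration, it is not part of fluentd). It rebuilds a shortened version of the record from the fluentd output in the question and shows that, after decoding the outer record, the application's fields are not top-level keys: they sit inside the `log` field as one string, trailing `\r` included. The shortened field list is an assumption made to keep the example small.

```python
import json

# The application's own log line, shortened to a few fields from the sample
# in the question (an illustrative subset, not the full record).
app_line = json.dumps(
    {"app_name": "SampleApp", "app_port": 6666, "level": "info", "msg": "listening"},
    separators=(",", ":"),
)

# The forwarded record carries that line as a string in the "log" field,
# with the trailing "\r" that shows up in the fluentd output.
forwarded = json.dumps(
    {
        "container_name": "/fervent_jemison",
        "source": "stdout",
        "log": app_line + "\r",
    }
)

# Decoding the outer record once does not unpack the application's fields.
record = json.loads(forwarded)

log_is_string = isinstance(record["log"], str)
has_app_name = "app_name" in record

print(sorted(record))                 # top-level keys of the forwarded record
print(type(record["log"]).__name__)   # type of the "log" value
print(has_app_name)                   # whether app_name is a top-level key
```

So whatever extracts `app_name`, `level` and the rest has to operate on the value of `log`, not on the record as a whole.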
Instead, you should use regular expressions. It seems you've done a good job with the logger, so the log will be easy to parse.
I've been testing, and this regexp rule should work for you:
{"app_name":"(?<app_name>[^"]*)","app_port":(?<app_port>[^"]*),"app_version":"(?<app_version>[^"]*)","file":"(?<file>[^"]*)","func":"(?<func>[^"]*)","fw_version":"(?<fw_version>[^"]*)","level":"(?<level>[^"]*)","msg":"(?<msg>[^"]*)","time":"(?<time>[^"]*)
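If you want to sanity-check the rule outside of fluentd, a rough Python sketch like the one below can help. It takes the regexp exactly as written above and runs it against the sample line from the question. Two assumptions to be aware of: Python spells named groups `(?P<name>...)` rather than the Ruby-style `(?<name>...)` that fluentd uses, so the sketch rewrites them before compiling, and `parse_log` is just a hypothetical helper name, not anything fluentd provides.

```python
import re

# The regexp from this answer, in the Ruby-style named-group syntax used by fluentd.
FLUENTD_REGEXP = r'{"app_name":"(?<app_name>[^"]*)","app_port":(?<app_port>[^"]*),"app_version":"(?<app_version>[^"]*)","file":"(?<file>[^"]*)","func":"(?<func>[^"]*)","fw_version":"(?<fw_version>[^"]*)","level":"(?<level>[^"]*)","msg":"(?<msg>[^"]*)","time":"(?<time>[^"]*)'

# Python needs (?P<name>...), so rewrite the named groups before compiling.
# (This pattern has no lookbehinds, so the rewrite only touches named groups.)
PY_REGEXP = re.compile(re.sub(r'\(\?<([A-Za-z_]\w*)>', r'(?P<\1>', FLUENTD_REGEXP))

# The sample application log line from the question.
SAMPLE = '{"app_name":"SampleApp","app_port":6666,"app_version":"0.0.2","file":"/build/examples/sample/app/runmain/main.go:131","func":"example.com/code/microservices/examples/sample/app/runmain.mainErr.func1","fw_version":"v0.0.1","level":"info","msg":"listening","time":"2024-01-18T20:31:39.163970213+07:00"}'


def parse_log(line):
    """Hypothetical helper: return the captured fields as a dict, or None if no match."""
    match = PY_REGEXP.search(line)
    return match.groupdict() if match else None


fields = parse_log(SAMPLE)
for name, value in fields.items():
    print(f"{name} = {value}")
```

Note that every captured value, including `app_port`, comes out as a string, and that the rule depends on the fields appearing in exactly this order.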
Since you said you are new to fluentd, I can only recommend that you test your parse regexps using these two sites:
https://fluentular.herokuapp.com/parse?regexp=%7B%22app_name%22%3A%22%28%3F%3Capp_name%3E%5B%5E%22%5D*%29%22%2C%22app_port%22%3A%28%3F%3Capp_port%3E%5B%5E%22%5D*%29%2C%22app_version%22%3A%22%28%3F%3Capp_version%3E%5B%5E%22%5D*%29%22%2C%22file%22%3A%22%28%3F%3Cfile%3E%5B%5E%22%5D*%29%22%2C%22func%22%3A%22%28%3F%3Cfunc%3E%5B%5E%22%5D*%29%22%2C%22fw_version%22%3A%22%28%3F%3Cfw_version%3E%5B%5E%22%5D*%29%22%2C%22level%22%3A%22%28%3F%3Clevel%3E%5B%5E%22%5D*%29%22%2C%22msg%22%3A%22%28%3F%3Cmsg%3E%5B%5E%22%5D*%29%22%2C%22time%22%3A%22%28%3F%3Ctime%3E%5B%5E%22%5D*%29&input=%7B%22app_name%22%3A%22SampleApp%22%2C%22app_port%22%3A6666%2C%22app_version%22%3A%220.0.2%22%2C%22file%22%3A%22%2Fbuild%2Fexamples%2Fsample%2Fapp%2Frunmain%2Fmain.go%3A131%22%2C%22func%22%3A%22example.com%2Fcode%2Fmicroservices%2Fexamples%2Fsample%2Fapp%2Frunmain.mainErr.func1%22%2C%22fw_version%22%3A%22v0.0.1%22%2C%22level%22%3A%22info%22%2C%22msg%22%3A%22listening%22%2C%22time%22%3A%222024-01-18T20%3A31%3A39.163970213%2B07%3A00%22%7D&time_format= and https://regex101.com/