json elasticsearch filebeat

Configuring Filebeat to ingest and parse .ndjson files


I'm new to Elasticsearch, Kibana and Filebeat.

I found out how to make Filebeat ingest JSON files into Elasticsearch using the decode_json_fields configuration in filebeat.yml, as described in https://www.elastic.co/guide/en/beats/filebeat/current/decode-json-fields.html [1], with a sample provided in https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html#decode-json-example [2].

However, even when following the example in [2], I did not get the expected result. I also found an older blog post (https://www.elastic.co/blog/structured-logging-filebeat [3]), but had no better outcome.

This is one of my filebeat.yml attempts:

filebeat.inputs:
- type: filestream
  paths: 
    - c:\\tmp\\tmf\\events-*.ndjson
  json.keys_under_root: true
  json.add_error_key: true
  #json.message_key: event
  fields: ["inner"]

output.elasticsearch:
  hosts: ["localhost:9200"]
  protocol: "https"
  username: ...
  password: ...

... and this is another:

filebeat.inputs:
- type: filestream
  paths: 
    - c:\\tmp\\tmf\\events-*.ndjson
  #json.keys_under_root: true
  #json.add_error_key: true
  #json.message_key: event
  #fields: ["inner"]

processors:
  - decode_json_fields:
      fields: [ "outer", "inner" ]
      max_depth: 1
      target: ""
      add_error_key: true

And this is my sample JSON log file:

{ "outer": "value", "inner": "{\"data\": \"value\"}" }

And this is how the indexed document looks when queried through the Kibana Dev Tools console (http://localhost:5601/app/dev_tools#/console):

GET /.ds-filebeat-8.3.2-2022.09.28-000001/_search?_source_excludes=ecs,host,os
{
  "query": {
    "match": {
      "log.file.path": {
        "query": """c:\tmp\tmf\events-20220930-11.ndjson"""
      }
    }
  }
}

Response:

{
  "hits": {
    "hits": [
      {
        "_source": {
          "input": {
            "type": "filestream"
          },
          "@timestamp": "2022-09-30T23:26:01.087Z",
          "message": """{ "outer": "value", "inner": "{\"data\": \"value\"}" }"""
        }
      }
    ]
  }
}

No matter how I tweak the filebeat.yml config file, the JSON file doesn't get parsed at all.

Any help is greatly appreciated.

References:
1. https://www.elastic.co/guide/en/beats/filebeat/current/decode-json-fields.html
2. https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html#decode-json-example
3. https://www.elastic.co/blog/structured-logging-filebeat


Solution

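  • TL;DR

    Filebeat is never told how the file is laid out, so it ships each line as a plain string in the message field instead of parsing it as JSON. The json.* options in the first attempt belong to the older log input and are not read by the filestream input, and the decode_json_fields processor in the second attempt targets outer and inner, which do not exist as fields while the whole line is still sitting unparsed in message.

    You should most likely give the filestream input a parser, so that each line is decoded as a JSON object.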

    Solution

    filebeat.inputs:
    - type: filestream
      id: events
      paths: 
        - c:\\tmp\\tmf\\events-*.ndjson
      parsers:
        - ndjson:
            target: ""
            add_error_key: true
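
    The sample line also carries a JSON document encoded as a string in inner. The ndjson parser above decodes only the outer object, so inner would most likely still arrive as a string. A minimal, untested sketch of one way to decode it as well is to keep the parser and add a decode_json_fields processor that points at inner. The path and input id are copied from the question and the config above; leaving out target is an assumption that relies on the documented default, where the decoded object replaces the original string field.

    ```yaml
    filebeat.inputs:
    - type: filestream
      id: events
      paths:
        - c:\\tmp\\tmf\\events-*.ndjson
      parsers:
        # Decode each line as one JSON object and place its keys at the event root.
        - ndjson:
            target: ""
            add_error_key: true

    processors:
      # Assumption: "inner" now exists as a string field, thanks to the parser above.
      # With no target set, the decoded object should replace the string in place.
      - decode_json_fields:
          fields: ["inner"]
          max_depth: 1
          add_error_key: true
    ```

    If this works as expected, the indexed document should contain outer as a string and inner as an object with a data key, rather than a single message string. Files that Filebeat has already read may need to be re-ingested before the change becomes visible.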