elasticsearch filebeat

Strange error with empty delimiter in dissect processor in filebeat


I'm trying to dissect the log message below with the pattern below, but Filebeat reports the error shown. I validated my input using the dissect-tester by jorgelbg, where it works without any issues.

I think it's especially strange that the delimiter is empty (``).

# Dissect Pattern
%{level} : %{timestamp} [%{class}] (%{file}) - %{message}
# Log Message
INFO : 2023-06-30 10:30:53,208 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 2269ms
# Error
2023-06-30T10:31:32.515Z        DEBUG   [processors]    processing/processors.go:128    Fail to apply processor client{dissect=%{level} : %{timestamp} [%{class}] (%{file}) - %{message},field=message,target_prefix=, timestamp=[field=timestamp, target_field=@timestamp, timezone=UTC, layouts=[2006-01-02 15:04:05,999]], add_tags=tag}: could not find delimiter: `` in remaining: `INFO : 2023-06-30 10:30:53,208 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 2269ms`, (offset: 0)

Is there anything I've got wrong about the pattern?

EDIT

I'm using the following filebeat config:

- type: filestream
  id: converter-logstream-id
  paths:
    - /logs/converter/*.log
  prospector.scanner.exclude_files:
    ["^/logs/content2alert-converter/transfer.log"]
  fields:
    input_source: converter
  processors:
    - dissect:
        tokenizer: "%{level} : %{timestamp} [%{class}] (%{file}) - %{message}"
        # field: "message"
        target_prefix: ""
        # trim_values: left
        overwrite_keys: true
    - timestamp:
        field: timestamp
        layouts:
          - "2006-01-02 15:04:05,999"
        test:
          - "2023-06-28 09:30:14,208"
    - add_tags:
        tags: [converter]
        target: "group"

When I print to the console instead of Elasticsearch, I get the following response:

{
  "@timestamp": "2023-06-30T12:58:18.067Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.6"
  },
  "group": [
    "converter"
  ],
  "ecs": {
    "version": "1.12.0"
  },
  "host": {
    "name": "host"
  },
  "agent": {
    "name": "host",
    "type": "filebeat",
    "version": "7.17.6",
    "hostname": "filebeat-wind-log-1-cd5mq",
    "ephemeral_id": "0bd88f31-8872-4842-bc1b-cb9f11eb63f0",
    "id": "571b3737-67bb-458d-934a-549204b017f5"
  },
  "log": {
    "offset": 1386,
    "file": {
      "path": "/logs/converter/converter-debug.log"
    },
    "flags": [
      "dissect_parsing_error"
    ]
  },
  "message": "\u001b[34mINFO \u001b[0;39m: 2023-06-30 06:43:10,980 \u001b[1;30m[d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208)\u001b[0;39m - Response erhalten vom GeoServer in 3397ms",
  "input": {
    "type": "filestream"
  },
  "fields": {
    "input_source": "converter"
  }
}

It looks like the message contains escape characters (\u001b[...m sequences) that are not visible in the log file.
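To illustrate why those characters matter, here is a minimal sketch in plain JavaScript. It is a simplified model of literal-delimiter matching, not Filebeat's actual dissect implementation; the helper name `firstMissingDelimiter` and the `DELIMITERS` list are hypothetical and were derived by hand from the tokenizer in the config above.

```javascript
// Simplified model only: NOT Filebeat's dissect implementation.
// Literal text between the %{...} keys of the tokenizer, derived by hand.
var DELIMITERS = [" : ", " [", "] (", ") - "];

// Returns the first delimiter that cannot be found (searching left to right),
// or null if every delimiter occurs in order.
function firstMissingDelimiter(message) {
  var offset = 0;
  for (var i = 0; i < DELIMITERS.length; i++) {
    var idx = message.indexOf(DELIMITERS[i], offset);
    if (idx === -1) {
      return DELIMITERS[i];
    }
    offset = idx + DELIMITERS[i].length;
  }
  return null;
}

// The message exactly as it appears in the console output above.
var colored = "\u001b[34mINFO \u001b[0;39m: 2023-06-30 06:43:10,980 \u001b[1;30m[d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208)\u001b[0;39m - Response erhalten vom GeoServer in 3397ms";
// The same message without the escape sequences.
var plain = "INFO : 2023-06-30 06:43:10,980 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 3397ms";

console.log(JSON.stringify(firstMissingDelimiter(colored))); // " : "
console.log(JSON.stringify(firstMissingDelimiter(plain)));   // null
```

In the colored message an escape sequence sits between "INFO " and ": ", so the literal " : " never occurs as contiguous text, even though the line looks fine in a terminal.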

Solution

I ended up using a custom JavaScript script processor, as recommended in this thread, to remove the ANSI color codes (thanks @Val). The logger in use is logback, but I can't change it there.

In case anyone else is in the same situation, this is the script I'm using which is based on this answer.

- script:
    lang: javascript
    source: >
      function process(event) {
        var originalMsg = event.Get('message');
        // Strip ANSI escape sequences such as \x1b[34m or \x1b[0;39m.
        var msg = originalMsg.replace(/\x1b\[([0-9,A-Z]{1,2}(;[0-9]{1,2})?(;[0-9]{3})?)?[mK]?/g, '');
        event.Put("message", msg);
      }
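The replacement can be checked outside Filebeat before deploying it. The following is a small sketch in plain JavaScript using a pattern modeled on the one in the script above; `stripAnsi` and `sample` are hypothetical names used only for this check, and the sample is the message from the console output.

```javascript
// Pattern modeled on the one in the script processor above.
var ANSI_PATTERN = /\x1b\[([0-9,A-Z]{1,2}(;[0-9]{1,2})?(;[0-9]{3})?)?[mK]?/g;

// Hypothetical helper: returns the message with ANSI color sequences removed.
function stripAnsi(message) {
  return String(message).replace(ANSI_PATTERN, "");
}

// Message taken from the console output above.
var sample = "\u001b[34mINFO \u001b[0;39m: 2023-06-30 06:43:10,980 \u001b[1;30m[d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208)\u001b[0;39m - Response erhalten vom GeoServer in 3397ms";

console.log(stripAnsi(sample));
```

The script processor has to be listed before the dissect processor in the processors list, since Filebeat runs processors in the order they are defined.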

Solution

  • The problem is that your logs contain ANSI color escape sequences, such as \u001b[34m before INFO and \u001b[0;39m after it, so the message does not match the dissect tokenizer.

    You first need to configure the logger that writes those log files so that it no longer emits the color sequences; then the pattern will work.