Grok syntax: how to distinguish a new entry from a continuation of the previous one


I am using Grok as the default plugin for filtering my logs. Say I have these 3 simple log entries:

2023-08-17 10:10:50.751 +02:00 [WARNING] [Provider] Failed to collect
2023-08-17 10:10:50.751 +02:00 [Error] [Provider] Failed to collect
AdapterException: Connection from Adapter to turbine could not be established
   at IsReadyAsync(CancellationToken token) in C:\server\Connection.cs:line 403
   at AlarmsAsync(CancellationToken token) in C:\server\Connection.cs:line 242
   at AlarmsAsync(CancellationToken token) in C:\server\Connection.cs:line 256
   at EventsAsync(Unit unit, LiveEventSubscriptionData eventData, CancellationToken token) in C:\server\Events.cs:line 55
2023-08-17 10:10:50.751 +02:00 [WARNING] [Provider] Failed to collect

To support reading multi-line exceptions, I created this Grok expression:

filter {
      grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{ISO8601_TIMEZONE:timezone} \[%{WORD:level}\] \[%{GREEDYDATA:source}\] %{GREEDYDATA:message}(?<message>(.|\r|\n)*)" }
      }
}

But here lies the problem: it reads the 3 entries as one, like this:

{
  "source": "Provider",
  "message": [
    "Failed to collect",
    "\n2023-08-17 10:10:50.751 +02:00 [Error] [Provider] Failed to collect\nAdapterException: Connection from Adapter to turbine could not be established\n   at IsReadyAsync(CancellationToken token) in C:\\server\\Connection.cs:line 403\n   at AlarmsAsync(CancellationToken token) in C:\\server\\Connection.cs:line 242\n   at AlarmsAsync(CancellationToken token) in C:\\server\\Connection.cs:line 256\n   at EventsAsync(Unit unit, LiveEventSubscriptionData eventData, CancellationToken token) in C:\\server\\Events.cs:line 55\n2023-08-17 10:10:50.751 +02:00 [WARNING] [Provider] Failed to collect"
  ],
  "level": "WARNING",
  "timezone": "+02:00",
  "timestamp": "2023-08-17 10:10:50.751"
}

I also tried adding the multiline codec, with no luck:

input {
 file {
   mode => "tail"
   path => "/usr/share/logstash/ingest_data/*"
   codec => multiline {
       pattern => "%{TIMESTAMP_ISO8601}"
       negate => true
       what => "previous"
   }
 }
}

I am stuck. Is there a way to tell the expression which line is a continuation and which one is a new log entry?


Solution

  • I figured this one out. To support multi-line entries in Logstash, the file input configuration needs a multiline codec that looks like this:

    codec => multiline {
       # pattern defining how a new log entry starts
       pattern => "^%{TIMESTAMP_ISO8601}"
       negate => true
       what => "previous"
    }
    

    and the grok pattern in the filter for this would look like:

    %{TIMESTAMP_ISO8601:event_time} %{ISO8601_TIMEZONE:timezone} \[%{WORD:level}\] \[%{GREEDYDATA:source}\] %{GREEDYDATA:message}
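
    Putting the two pieces together, a complete pipeline might look like the sketch below. The path and codec settings are taken from the question. The `auto_flush_interval` value, the `(?m)` prefix, the `overwrite` setting, and the switch from `GREEDYDATA` to `DATA` for `source` are my own assumptions and not part of the original solution, so adjust them to your setup.

    ```
    input {
      file {
        mode => "tail"
        path => "/usr/share/logstash/ingest_data/*"
        codec => multiline {
          # A line that starts with a timestamp begins a new event;
          # every line that does NOT match is appended to the previous event.
          pattern => "^%{TIMESTAMP_ISO8601}"
          negate => true
          what => "previous"
          # Assumption: emit the last buffered event after 2 seconds without
          # new lines, otherwise it waits for the next timestamped line.
          auto_flush_interval => 2
        }
      }
    }

    filter {
      grok {
        # Assumption: (?m) lets GREEDYDATA span newlines, so the stack trace
        # stays in the message. DATA (non-greedy) is used for source so that
        # it stops at the first closing bracket.
        match => { "message" => "(?m)^%{TIMESTAMP_ISO8601:event_time} %{ISO8601_TIMEZONE:timezone} \[%{WORD:level}\] \[%{DATA:source}\] %{GREEDYDATA:message}" }
        # Replace the original message field instead of turning it into an array.
        overwrite => [ "message" ]
      }
    }
    ```

    With this setup the codec decides where one event ends and the next begins, so the grok expression only ever sees a single log entry (plus its stack trace, if any) and no longer needs the `(.|\r|\n)*` catch-all from the question.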