regex logstash logstash-grok oniguruma

Negating from symbol until the previous space


I am trying to use Logstash grok filters (Oniguruma regex) to parse some logs. For a log entry that looks like this:

2019-03-24 17:57:14,202 p=19455 u=root |  TASK [this is the task name msg=Debug message] ************************

I have written this filter:

%{DATE:date}\s%{TIME:time}\sp=(?<id>[\d]+)\su=(?<user>[\w]+)\s\|\s*TASK\s*\[(?<task>[^=]*)

The difficulty is that I need the "task" capture to match exactly this: "this is the task name". At the moment, "task" matches "this is the task name msg". Of course, this is only an example, and the words themselves change from entry to entry.

This is an Ansible log, which for some reason puts the task name and the task details on the same log line, separated only by spaces. In every case, I know the task name has ended and the task details have begun because of the "=" symbol.

So I need to match until a "=" is found, and then exclude the word just before it, which in this case is "msg" (depending on the task, this word could also change).

Any ideas how to accomplish this? Thanks!


Solution

  • You may use

    %{DATE:date}\s%{TIME:time}\sp=(?<id>\d+)\su=(?<user>\w+)\s\|\s*TASK\s*\[(?<task>[^\]=]*)\s\w+=
    

    See the regex demo
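
To check the idea outside Logstash, here is a minimal Python sketch run against the sample line from the question. It makes several assumptions: Python's `re` module is not Oniguruma, so named groups are written `(?P<name>...)` instead of `(?<name>...)`; the grok macros `%{DATE}` and `%{TIME}` are replaced by hand-written stand-ins that only cover the timestamp format shown above; and the `p=` field from the question's filter is kept so the pattern lines up with the sample entry.

```python
import re

# Hand-written stand-ins for grok's %{DATE} and %{TIME} macros (assumption:
# they only need to cover the timestamp format of the sample line).
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})\s(?P<time>\d{2}:\d{2}:\d{2},\d+)"
    r"\sp=(?P<id>\d+)\su=(?P<user>\w+)\s\|\s*TASK\s*\["
    # [^\]=]* first runs up to the "=", then backtracks until it is followed
    # by one whitespace character, a word, and the "=" sign.
    r"(?P<task>[^\]=]*)\s\w+="
)

line = (
    "2019-03-24 17:57:14,202 p=19455 u=root |  "
    "TASK [this is the task name msg=Debug message] ************************"
)

match = PATTERN.search(line)
fields = match.groupdict() if match else {}
print(fields.get("task"))  # prints: this is the task name
```

Because the trailing `\s\w+=` is mandatory, a line whose brackets contain no `key=` part will not match this pattern at all, so such lines would need a separate pattern.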

    The (?<task>[^\]=]*)\s\w+= part is of interest: