[SOLVED] Negating from symbol until the previous space

I am trying to use Logstash grok filters (Oniguruma regex) to parse some logs. For a log entry that looks like this:

2019-03-24 17:57:14,202 p=19455 u=root |  TASK [this is the task name msg=Debug message] ************************

I have written this filter:

%{DATE:date}\s%{TIME:time}\sp=(?<id>[\d]+)\su=(?<user>[\w]+)\s\|\s*TASK\s*\[(?<task>[^=]*)

The difficulty is that I need the "task" capture to match exactly this: "this is the task name". Currently "task" matches "this is the task name msg". Of course, this is only an example; the words themselves change from line to line.

This is an Ansible log, which for some reason puts the task name and the task details on the same log line, separated only by spaces. In every case, I know that the task name has ended and the task details have begun because of the "=" symbol.

So I need to match until a "=" is found, and then exclude the word right before it, which in this case is "msg" (depending on the task, this word could also change).

Any ideas how to accomplish this? Thanks!

Solution

You may use

%{DATE:date}\s%{TIME:time}\sp=(?<id>\d+)\su=(?<user>\w+)\s\|\s*TASK\s*\[(?<task>[^\]=]*)\s\w+=

The (?<task>[^\]=]*)\s\w+= part is of interest:

(?<task>[^\]=]*) - a group named "task": [^\]=]* matches zero or more characters other than ] and =
\s - a single whitespace character
\w+ - one or more word characters
= - a literal = character
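
To see why this leaves the trailing word out of "task": [^\]=]* first consumes everything up to the "=", and the regex engine then backtracks until the remaining \s\w+= can match, so the last word before "=" ends up outside the group. Below is a minimal sketch of that behaviour in Python, not in Logstash itself. Python's re module writes named groups as (?P<name>...), and DATE and TIME are simplified stand-ins assumed here for the grok %{DATE} and %{TIME} patterns, not their real definitions. SAMPLE and parse_task are hypothetical names used only for illustration.

```python
import re

# Simplified stand-ins for grok's %{DATE} and %{TIME} (assumptions, not the
# real grok definitions); they only need to fit the sample line's format.
DATE = r"\d{4}-\d{2}-\d{2}"
TIME = r"\d{2}:\d{2}:\d{2},\d+"

# Same structure as the grok expression above, with Python-style named groups.
PATTERN = re.compile(
    rf"(?P<date>{DATE})\s(?P<time>{TIME})\sp=(?P<id>\d+)\su=(?P<user>\w+)"
    r"\s\|\s*TASK\s*\[(?P<task>[^\]=]*)\s\w+="
)

SAMPLE = (
    "2019-03-24 17:57:14,202 p=19455 u=root |  "
    "TASK [this is the task name msg=Debug message] ************************"
)


def parse_task(line):
    """Return the named captures as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    fields = parse_task(SAMPLE)
    print(fields["task"])  # this is the task name
```

Note that the pattern requires a key=value pair inside the brackets, so a TASK line with no "=" after the task name will not match at all.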