
Grok Syntax Issues


I'm using Filebeat to send logs to Logstash, but I'm having trouble with the grok syntax in Logstash. I used the Grok Debugger in Kibana and managed to arrive at a working pattern. The problem is that I can't find the equivalent syntax for Logstash.

The original log:

{"log":"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \"PROPFIND /remote.php/dav/files/xxx@yyyy.com/ HTTP/1.1\" 207 1035 \"-\" \"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\"\n","stream":"stdout","time":"2022-08-22T09:37:54.782377901Z"}

The message received in Logstash:

"message" => "{\"log\":\"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \\\"PROPFIND /remote.php/dav/files/xxx@yyyy.com/ HTTP/1.1\\\" 207 1035 \\\"-\\\" \\\"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\\\"\\n\",\"stream\":\"stdout\",\"time\":\"2022-08-22T09:37:54.782377901Z\"}",
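A possible reason for the extra backslashes, offered as an assumption rather than a confirmed diagnosis: the `"message" => "..."` form looks like Logstash's debug output, which prints strings in a quoted, escaped form. If so, the event itself still holds the original line, with no backslash in front of `"log"`. The Python sketch below is not Logstash code; it uses `json.dumps` as a stand-in for the escaped display and a shortened copy of the log line, only to show how a pattern written against the display can fail on the real content.

```python
import json
import re

# The raw line as it sits in the log file (shortened for the example).
raw = r'{"log":"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \"PROPFIND / HTTP/1.1\" 207 1035\n","stream":"stdout"}'

# A quoted, escaped display of the same string (stand-in for debug output).
displayed = json.dumps(raw)

# Expects a literal backslash before each quote, as seen in the display.
with_backslash = re.compile(r'\{\\"log\\":\\"(\d+\.\d+\.\d+\.\d+)')
# Expects plain quotes, as in the actual content.
without_backslash = re.compile(r'\{"log":"(\d+\.\d+\.\d+\.\d+)')

print(with_backslash.search(raw))                    # no match on the real content
print(with_backslash.search(displayed) is not None)  # matches the escaped display
print(without_backslash.search(raw).group(1))        # the client IP
```

Under that assumption, a pattern developed against a pasted copy of the escaped display would carry one level of backslashes too many for the real event.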

The grok pattern I used in the Grok Debugger (Kibana):

{\\"log\\":\\"%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\\\\\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\\\\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\\\\\("%{DATA:referrer}\\\\\\") \\\\\\"%{DATA:user-agent}\\\\\\"

The real problem is that I can't even manage to get the IP (188.188.188.188). I tried:

match => { "message" => '{\\"log\\":\\"%{IPORHOST:clientip}' } # backslash to escape the backslash
match => { "message" => '{\\\"log\\\":\\\"%{IPORHOST:clientip}' } # backslash to escape the quote
match => { "message" => "{\\\"log\\\":\\\"%{IPORHOST:clientip}" } # backslash to escape the quote

The log shown here is shortened. The real log mixes JSON and plain strings, so I can't send it as JSON from Filebeat.


Solution

  • OK, so I managed to make it work by using this:

        # 1. Split the syslog prefix from the JSON payload.
        grok {
            match => { "message" => '%{SYSLOGTIMESTAMP:syslog_timestamp} %{IPORHOST:syslog_server} %{WORD:syslog_tag}: %{GREEDYDATA:jsonMessage}' }
        }
        # 2. Parse the JSON payload into fields (log, stream, time).
        json {
            source => "jsonMessage"
        }
        # 3. Parse the access-log line from the raw jsonMessage (its quotes are still backslash-escaped).
        grok {
            match => { "jsonMessage" => '%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\("%{DATA:referrer}\\") \\"%{DATA:user-agent}\\"' }
        }
    

    With a log line like this:

    Aug 24 00:00:01 hostname containers: {"log":"188.188.188.188 - user.name@things.com [23/Aug/2022:23:59:52 +0200] \"PROPFIND /remote.php/dav/files/ HTTP/1.1\" 207 1159 \"-\" \"Mozilla/5.0 (Linux) mirall/3.4.2-1ubuntu1 (Nextcloud, ubuntu-5.15.0-46-generic ClientArchitecture: x86_64 OsArchitecture: x86_64)\"\n","stream":"stdout","time":"2022-08-23T21:59:52.612843092Z"}
    

    The first grok extracts the first three fields (timestamp, hostname and tag) and uses GREEDYDATA to capture everything after the colon into jsonMessage. The json filter then parses jsonMessage and creates new fields from it, including log. From that point on, the information we need is available in the log field.

    I still don't understand why my grok pattern works in the Kibana Grok Debugger but not in Logstash. It is probably because some characters need to be escaped, but even when I escaped them it didn't work.
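The staged approach described in this answer (strip the syslog prefix, decode the JSON, then parse the access-log text) can be sketched outside Logstash as well. The Python below is only an illustration under assumptions: the regular expressions are simplified stand-ins for the grok patterns, the user agent is shortened, and here the last step runs on the decoded log field, where the quotes are no longer backslash-escaped.

```python
import json
import re

# Sample line from the answer, with a shortened user agent.
line = (
    'Aug 24 00:00:01 hostname containers: '
    r'{"log":"188.188.188.188 - user.name@things.com [23/Aug/2022:23:59:52 +0200] '
    r'\"PROPFIND /remote.php/dav/files/ HTTP/1.1\" 207 1159 \"-\" \"Mozilla/5.0 (Linux)\"\n",'
    r'"stream":"stdout","time":"2022-08-23T21:59:52.612843092Z"}'
)

# Stage 1: split off the syslog prefix; the rest of the line is the JSON payload.
prefix = re.compile(
    r'^(?P<ts>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) (?P<tag>\w+): (?P<payload>.*)$'
)
m = prefix.match(line)

# Stage 2: decode the JSON; this removes the JSON-level escaping inside "log".
doc = json.loads(m.group("payload"))

# Stage 3: parse the access-log text using plain quotes.
access = re.compile(
    r'^(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\w+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d+|-) (?P<bytes>\d+|-) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
fields = access.match(doc["log"]).groupdict()
print(fields["clientip"], fields["verb"], fields["response"])
```

Decoding the JSON before matching is what removes the need to reason about backslashes in the final pattern, since each layer of escaping is undone by the tool that understands it.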