I am trying to configure logstash to manage my various log sources, one of which is Mongrel2. The format used by Mongrel2 is tnetstring, where a log message will take the form
86:9:localhost,12:192.168.33.1,5:57089#10:1411396297#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]
I want to write my own grok patterns to extract certain fields from the above format. I started by testing my regex on the above message in an online regex tester. The regex is
^(?:[^:]*\:){2}([^,]*)
This matches localhost. When I use the same regex as a grok pattern in the form
TEST ^(?:[^:]*\:){2}([^,]*)
MONGREL %{TEST:test}
and configure logstash with
filter {
  grok {
    match => [ "message", "%{MONGREL}" ]
  }
}
the same regex results in the match 86:9:localhost. I can't figure out where I am going wrong. Is it that the regex engine I was using to test is based on Python, while the grok filter's regex engine is Oniguruma?
I am currently testing it in grokdebug with the following input
86:9:localhost,12:192.168.33.1,5:57089#10:1411396297#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]
and the following pattern
(?<hostname>^(?:[^:]*\:){2}([^,]*))
resulting in
{
  "hostname": [
    [
      "86:9:localhost"
    ]
  ]
}
where I want
{
  "hostname": [
    [
      "localhost"
    ]
  ]
}
A pattern like this will extract the host name:
^(\d+)?:(\d+)?:(?<hostname>[^,]+),
Or, written in a manner similar to your original:
^(?:[^:]*\:){2}(?<hostname>[^,]*)
The capture name needs to go on the parentheses around the part you want to capture. Your pattern put the name around the whole expression, so it captured everything from the start of the line up to the first comma.
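
To see that this is about where the name is placed rather than a difference between regex engines, here is a minimal Python sketch that reproduces both results. It assumes Python's built-in re module, which spells named groups as (?P<name>...) instead of the (?<name>...) form that grok uses; the variable names are only for illustration.

```python
import re

line = ("86:9:localhost,12:192.168.33.1,5:57089#10:1411396297"
        "#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]")

# Name wrapped around the whole expression, as in the grokdebug attempt:
# the named group spans the two length prefixes as well as the host.
whole = re.match(r"^(?P<hostname>(?:[^:]*:){2}([^,]*))", line)

# Name placed only on the group that holds the host.
inner = re.match(r"^(?:[^:]*:){2}(?P<hostname>[^,]*)", line)

print(whole.group("hostname"))  # 86:9:localhost
print(inner.group("hostname"))  # localhost
```

The same reasoning applies to %{TEST:test}: grok stores everything the TEST pattern matched under the given name, and the unnamed inner group is not kept as a field.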