logstash-grok  logstash-configuration  grok  logstash-filter

Create a custom grok pattern


I am using Logstash to structure logs of the following type:

14 Apr 2020 22:49:02,868 [INFO] 1932a8e0-3892-4bae-81e3-1fc1850dff55-LPmAoB (coral-client-orchestrator-41786) hub_delivery_audit: RequestContext{CONTAINER_ID=200414224842439045902810201AZ, TRACKING_ID=TSTJ8N7GLBS0ZZW, PHYSICAL_ATTRIBUTES=PhysicalAttributes(length=Dimension(value=30.0, unit=CM, type=null), width=Dimension(value=30.0, unit=CM, type=null), height=Dimension(value=30.0, unit=CM, type=null), scaleWeight=Weight(value=5.0, unit=kg, type=null)), SHIP_METHOD=AMZN_US_PRIME, ADDRESS_ID=LDI7ICATBZNOAQNW634MG057BMA07370713J4ZQ1VGOMB7KPXTQ2EIA2OX4CKT7L, CUSTOMER_ID=A07370713J4ZQ1VGOMB7K, REQUEST_STATE=UNKNOWN, RESPONSE=GetAccessPointsForHubDeliveryOutput(destinationLocation=null, fallBackLocation=null, capability=null), IS_COMMERCIAL_ATTRIBUTE_PRESENT=false}

and I wanted to extract the following data out of it:

CONTAINER_ID

TRACKING_ID

PHYSICAL_ATTRIBUTES

SHIP_METHOD

ADDRESS_ID

REQUEST_STATE

RESPONSE

But I can't work out an appropriate filter for such a large log event. I've tried https://grokdebug.herokuapp.com/ and gone through the Logstash grok documentation, but I still couldn't extract the required fields. I could only come up with this:

%{MONTHDAY:monthday} %{MONTH:month} %{YEAR:year} %{TIME:time} \[%{LOGLEVEL:logLevel}\] %{HOSTNAME}

Please suggest an approach, and explain how to extract only the fields listed above without creating extra fields such as time and date.


Solution

  • I tried the following grok pattern in the grok debugger (https://grokdebug.herokuapp.com/):

    {CONTAINER_ID=%{DATA:container_id}, TRACKING_ID=%{DATA:tracking_id}, PHYSICAL_ATTRIBUTES=PhysicalAttributes%{DATA:physical_attributes}, SHIP_METHOD=%{DATA:ship_method}, ADDRESS_ID=%{DATA:address_id}, CUSTOMER_ID=%{DATA:customer_id}, REQUEST_STATE=%{DATA:request_state}, RESPONSE=%{GREEDYDATA:response}(?=,)

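To use a pattern like this in a pipeline without picking up date or time fields, one option is to leave the log prefix out of the expression altogether. Grok patterns are not anchored, so the match can start at `RequestContext{`. The following is only a sketch under some assumptions: the raw line is in the default `message` field, every event contains all of these keys in this order, and `IS_COMMERCIAL_ATTRIBUTE_PRESENT` always follows `RESPONSE`. `CUSTOMER_ID` is matched with an unnamed `%{DATA}`, so no field is created for it; give it a name if you want it.

```
filter {
  grok {
    match => {
      # Assumes the log line is stored in "message" (the usual default).
      # Nothing before "RequestContext{" is matched, so the timestamp,
      # log level, and request ID never become fields.
      # Each lazy %{DATA} stops at the next literal ", KEY=" marker,
      # which keeps the nested commas inside PHYSICAL_ATTRIBUTES and RESPONSE.
      "message" => "RequestContext\{CONTAINER_ID=%{DATA:container_id}, TRACKING_ID=%{DATA:tracking_id}, PHYSICAL_ATTRIBUTES=%{DATA:physical_attributes}, SHIP_METHOD=%{DATA:ship_method}, ADDRESS_ID=%{DATA:address_id}, CUSTOMER_ID=%{DATA}, REQUEST_STATE=%{DATA:request_state}, RESPONSE=%{DATA:response}, IS_COMMERCIAL_ATTRIBUTE_PRESENT="
    }
  }
}
```

In this version `physical_attributes` keeps the leading `PhysicalAttributes(` text. Events that do not match are tagged `_grokparsefailure` by default, which makes it easy to spot lines with a different key order.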