google-cloud-platformgoogle-bigquerylog-analysis

Extract a string value from a JSON field in GCP log analytics


I am working on building some stats using log analytics in GCP where I am writing a SQL query to extract a value from a JSON field.

I tried few queries but is always returns null. I think I am doing something wrong in the regex.

Here is the sample data

responseContent: ""[{\"caseID\":\"1234fhdvdhkdbb\",\"reference\":\"aeb6c134-13e6-4f37-9507-9d4b7eeaf0e4\"}]""

And here is my query -

SELECT
  timestamp, severity, resource.type, log_name, text_payload, proto_payload, json_payload, REGEXP_EXTRACT(JSON_VALUE(json_payload.proxy_details.responseContent), r'reference:\s*(\d*\.?\d+)') as name
FROM
  `my-proj.global._Default._Default` AS logs

Can someone advise on what is being missed here? Many Thanks in Advance.


Solution

  • As commented by @Shashank Agrawal, the following query could resolve the issue.

    SELECT timestamp, severity, resource.type, log_name, text_payload, proto_payload, json_payload, REGEXP_EXTRACT(JSON_VALUE(json_payload.proxy_details.responseContent), r'reference\\":\\"(.*)\\') as name,

    Posting the answer as community wiki for the benefit of the community that might encounter this use case in the future. Feel free to edit this answer for additional information.