amazon-web-servicesamazon-cloudwatchaws-cloudwatch-log-insights

Unnesting a comma seperated string in AWS cloudwatch insights query


I have a log message like this

{"level":"INFO","location":"log_missing_items:xxx","message":"Missing_Items: Item1, Item2, Item3 ...","timestamp":"2025-01-02 21:46:15,382+0000","service":"xxxx","xray_trace_id":"xxxxxxxxx"}

I am trying to write a cloudwatch insights query to get only the missing items as rows so that it can be displayed in a cloudwatch insights query widget in a dashboard.

Expected result

Item1
Item2
Item3

is this possible with AWS cloudwatch Insights query ?


Solution

  • You can now use OpenSearch SQL language introduced in december 2024 during the AWS re:Invent 2024 to query your logs with Cloudwatch logs insight. You will find more details in this blog written by Elizabeth Fuentes and her team.

    Here's an SQL example to achieve your needs:

    SELECT distinct get(split(regexp_extract(`@message`, 'Missing_Items: (.*)'), '[,]'), b.index) as missingItem
    FROM `LogGroupA` a
    INNER JOIN (
        SELECT ROW_NUMBER() OVER w - 1 index, MAX(regexp_count(regexp_extract(`@message`, 'Missing_Items: (.*)'), '[^,]+')) OVER w max_missing_items
        FROM `LogGroupA`
        WINDOW w AS (ORDER BY id)
    ) b
    WHERE get(split(regexp_extract(`@message`, 'Missing_Items: (.*)'), '[,]'), b.index) <> ''
    AND b.index < b.max_missing_items
    

    That query could have been much easier if Generator functions were supported by CloudWatch Logs. The explode function for example would do it directly (Something like SELECT distinct explode(split(regexp_extract(`@message`, 'Missing_Items: (.*)'), '[,]')) as missingItem FROM `LogGroupA` ).

    More details on supported OpenSearch SQL commands and functions here.