kqlazure-data-explorer

KQL - extract property value from an array of JSON objects, based on the value of another property


Suppose I have a table with some columns. One of the columns is called Details, has a dynamic type, and is an array of JSON objects similar to this:

[
    {
        "key": "Name",
        "value": "Dave"
    },
    {
        "key": "Age",
        "value": "30"
    },
    {
        "key": "Gender",
        "value": "Male"
    }
]

I would like to efficiently pull out the value from the object that has the key = Gender, and store that in a new field called Gender. However - an object with that key won't always be in the field, and if that is the case I want to default it to Undefined in my extracted property.

So if I had:

datatable(d:dynamic)
[
    dynamic({
        "Details":[
            {"value":"Dave","key":"Name"},
            {"value":"Male","key":"Gender"},
            {"value":"30","key":"Age"}
        ]
    }),
    dynamic({
        "Details":[
            {"value":"Dan","key":"Name"},
            {"value":"40","key":"Age"}
        ]
    }),
]

I would want my two rows to be:

Gender
------
Male
Undefined

The dataset that will be queried will go back up to a year and could contain a million of rows, so that may affect the approach.

I am aware of mv-extend and mv-apply, but I don't think these will work because the entire contents of the array needs to be considered when deciding whether to assign a default value or not.

Thank you :)


Solution

  • below are two options, you can test how they perform for your use case, and if they meet your requirements.

    note that if this is a common query requirement, you may want to extract this information at ingestion time, e.g. using an "ingestion mapping" or an "update policy".

    datatable(d:dynamic)
    [
        dynamic({
            "Details":[
                {"value":"Dave","key":"Name"},
                {"value":"Male","key":"Gender"},
                {"value":"30","key":"Age"}
            ]
        }),
        dynamic({
            "Details":[
                {"value":"Dan","key":"Name"},
                {"value":"40","key":"Age"}
            ]
        }),
    ]
    | project Gender = case (d !has 'Gender', 'Unknown', extract(@'{"value":"([^"]+)","key":"Gender"}', 1, tostring(d)))
    
    Gender
    Male
    Unknown
    datatable(d:dynamic)
    [
        dynamic({
            "Details":[
                {"value":"Dave","key":"Name"},
                {"value":"Male","key":"Gender"},
                {"value":"30","key":"Age"}
            ]
        }),
        dynamic({
            "Details":[
                {"value":"Dan","key":"Name"},
                {"value":"40","key":"Age"}
            ]
        }),
    ]
    | mv-apply details = d.Details on (
        summarize arr = make_set_if(details.value, details.key == "Gender")
        | project Gender = case(isnotnull(arr[0]), tostring(arr[0]), "Undefined")
    )
    | project Gender
    
    Gender
    Male
    Undefined