jsoncountsnowflake-cloud-data-platformjson-flattener

Query to count all records without certain key in the json column of the snowflake table


I am trying to fetch the count the number of records from a Snowflake table without certain keys in the json column of that particular record.

Here’s how the snowflake table looks like :

EMP_ID|DEPARTMENT_NAME|EID|DETAILS
EMP10001 | Finance |10008918 |{                                                                                 
"name": "Alec George",                                                        
"Year_Joined": "2013",
"Ready_to_transfer":  "no",
"Ready_to_permanently_WFH":  "yes",
}

Now I want to count records that doesn’t have have the keys that start with Ready_ in the details column of the snowflake table and group counts by the Department_Name.

Note : There can be multiple keys that start with Ready_ in the details.

Currently what’s happening is my count query is returning records where keys start with Ready_ is also listed.


Solution

  • You can flatten to get all the keys, then for each record you can count the number of keys that start with your desired string:

    with data as (
    select $1 emp_id, $2 dep, $3 eid, parse_json($4) details
    from (values
    ('EMP10001','Finance', 10008918, '{
    "name": "Alec George",
    "Year_Joined": "2013", "Ready_to_transfer": "no", "Ready_to_permanently_WFH": "yes", }')
    ,('EMP10002','Finance', 10008918, '{
    "name": "Alex George",
    "Year_Joined": "2013", }')
    )
    )
    
    select seq, count_if(detail.key like 'Ready_%') how_many_ready
    from data, table(flatten(details)) detail
    group by 1
    ;
    

    enter image description here

    Then you only need to count the # of elements that have a count > 0.