sqlarraysjsongoogle-bigquery

Bigquery transform dictionary to array


I have a Bigquery table with a 2 fields: uid (string) and cart (json), the json field has random keys (the name of the keys are not predictable). Example of a row:

uid                 |            cart
____________________________________________________
 some_uid           |   {
                    |     "random_abc": {
                    |        "field1": "value1",
                    |        "field2": "value2",
                    |       },
                    |     "random_xyz": {
                    |        "field1": "value1",
                    |        "field2": "value2",
                    |     }
                    |   }

I'd like to transform cart to be an array as follow:

uid                 |            cart
____________________________________________________
 some_uid           |   [
                    |     {
                    |        "key": "random_abc"
                    |        "field1": "value1",
                    |        "field2": "value2",
                    |     },
                    |     {
                    |        "key": "random_xyz"
                    |        "field1": "value1",
                    |        "field2": "value2",
                    |     }
                    |   ]

I'm struggling to loop over cart keys, since the naming is random


Solution

  • The call to json_keys(cart,1) returns an array of keys with a maximum depth of 1, so it will return only the random... keys. The array is unnested and joined to the table and json_set() call adds a key called key with random_... as its value to.

    Each row consists of the uid and JSON object corresponding to the value of random_... element of cart, with the random_... added under key.

    Finally, grouping by the uid joins the JSON objects into an array.

    with your_table as (
       select 'uid1' as uid, JSON '''
              {
                "random_abc":
                {
                    "field1": "value1",
                    "field2": "value2"
                },
                "random_xyz":
                {
                    "field1": "value1",
                    "field2": "value2"
                }
              } 
       ''' cart
    )
    select sq.uid, array_agg(rec) cart
      from (
            select yt.uid,
                   to_json(json_set(cart[key],'$.key', key, create_if_missing => true)) rec
              from your_table yt,
                   unnest(json_keys(cart,1)) key
    ) sq
    group by sq.uid
    

    The final result is this:

    |   uid    |          cart           |
    |----------|-------------------------|
    | some_uid | [                       |
    |          |  {                      |
    |          |    "field1": "value1",  | 
    |          |    "field2": "value1",  |
    |          |    "key": "random_xyz"  |
    |          |  },                     |
    |          |  {                      |
    |          |    "field1": "value1",  |
    |          |    "field2": "value2",  |
    |          |    "key": "random_abc"  |
    |          |  }                      |
    |          | ]                       |