jsonsnowflake-cloud-data-platformjson-flattener

Snowflake Flatten JSON Multiple Arrays


I'm receiving multiple JSON files and loading them into a Snowflake variant field. I would like to then extract some of the child arrays into a separate table but am having issues with that. Here is an example of the JSON:

{
  "ordernumber": 123,
  "customername": "Smith,John",
  "orderdata": {
    "Order": {
      "ItemCountGroup": {
        "items": {
          "item": [
            {
              "ItemCount": "1.00"
            },
            {
              "ItemCount": "1.00"
            }
          ]
        }
      },
      "ItemDescGroup": {
        "items": {
          "item": [
            {
              "ItemDesc": "Series 100 Food Bucket"
            },
            {
              "ItemDesc": "Series X Leather Gloves"
            }
          ]
        }
      },
      "ItemTypeGroup": {
        "items": {
          "item": [
            {
              "ItemType": "Bucket/Pail"
            },
            {
              "ItemType": "Gloves"
            }
          ]
        }
      },
    }
  }
}

Ideally, I'd like to flatten the table so that it comes out:

ItemCount ItemDesc ItemType OrderNumber
1.00 Series 100 Food Bucket Bucket/Pail 123
1.00 Series X Leather Gloves Gloves 123

I've tried a series of different flatten commands and lateral flattens but usually am getting a row for every combination possible between each of the elements. Is there a way or example out there for me to get these into the required format?


Solution

  • If you have a table named T1 with your JSON in a variant column named V, you can flatten each part in a CTE and join them by the index created during the flattening:

    with ItemCountGroup as
    (
    select v:ordernumber ORDERNUMBER, INDEX IDX, VALUE:ItemCount::number(38,2) ITEM_COUNT from T1, 
        table(flatten(v:orderdata.Order.ItemCountGroup.items.item))
    ),
    ItemDescriptionGroup as
    (
    select INDEX IDX, VALUE:ItemDesc::string ITEM_DESC from T1, 
        table(flatten(v:orderdata.Order.ItemDescGroup.items.item))
    ),
    ItemTypeGroup as
    (
    select INDEX IDX, VALUE:ItemType::string ITEM_TYPE from T1, 
        table(flatten(v:orderdata.Order.ItemTypeGroup.items.item))
    )
    select ITEM_COUNT, ITEM_DESC, ITEM_TYPE, ORDERNUMBER
    from ItemCountGroup IC
        left join ItemDescriptionGroup ID on IC.IDX = ID.IDX
        left join ItemTypeGroup IT on IC.IDX = IT.IDX
    ;
    
    ITEM_COUNT ITEM_DESC ITEM_TYPE ORDERNUMBER
    1 Series 100 Food Bucket Bucket/Pail 123
    1 Series X Leather Gloves Gloves 123