sqlarraysjsonsnowflake-cloud-data-platformflatten

SQL query nested arrays JSON Snowflake


I am trying to query a table in Snowflake that is built with a single field labeled "Value" and each row is a separate JSON object.

Row 1: {
  "artist": {
    "artistID": "artist_1",
    "artistName": "Roblox"
  },
  "descriptors": {
    "styles": [
      {
        "ID": "84121",
        "weight": "63"
      },
      {
        "ID": "83983",
        "weight": "14"
      }
    ],
  "duration": "240509",
   "productCodes": [
    {
      "type": "ISRC",
      "value": "isrc_1"
    }
  ]
}

Row 2: {
  "artist": {
    "artistID": "artist_2",
    "artistName": "Minecraft"
  },
  "descriptors": {
    "styles": [
      {
        "ID": "84122",
        "weight": "12"
      },
      {
        "ID": "83983",
        "weight": "14"
      }
    ],
  "duration": "400001",
   "productCodes": [
    {
      "type": "ISRC",
      "value": "isrc_2"
    }
  ]
}

Row 3: {
  "artist": {
    "artistID": "artist_3",
    "artistName": "Fortnite"
  },
  "descriptors": {
    "styles": [
      {
        "ID": "84121",
        "weight": "47"
      },
      {
        "ID": "83983",
        "weight": "14"
      }
    ],
  "duration": "300001",
  "productCodes": [
    {
      "type": "ISRC",
      "value": "isrc_3"
    },
    {
      "type": "ISRC",
      "value": "isrc_4"
    }
  ]
}

What I am trying to do is SELECT a column which includes all of the ISRCs, and another column which includes their associated styles. As you can see, some rows can have multiple ISRCs, and each row can have multiple styles. The output dataframe should look like this:

enter image description here

I'm having a hard time wrapping my head around the nested arrays - can you point me in the right direction? Thank you!


Solution

  • this worked for me :

    WITH json_data AS (
      SELECT 
        PARSE_JSON('{
          "artist": {
            "artistID": "artist_1",
            "artistName": "Roblox"
          },
          "descriptors": {
            "styles": [
              {
                "ID": "84121",
                "weight": "63"
              },
              {
                "ID": "83983",
                "weight": "14"
              }
            ],
            "duration": "240509",
            "productCodes": [
              {
                "type": "ISRC",
                "value": "isrc_1"
              }
            ]
          }
        }') AS data
    )
    
    SELECT 
      pc.value:value::string AS isrc,
      s.value:ID::string AS style_id,
      s.value:weight::string AS style_weight
    FROM json_data jd,
      LATERAL FLATTEN(input => jd.data:descriptors:productCodes) pc,
      LATERAL FLATTEN(input => jd.data:descriptors:styles) s
    WHERE pc.value:type::string = 'ISRC';
    

    Output

    enter image description here