
Lateral Flatten Snowflake from a Variant table


I have a table called raw_json whose VARIANT column, jsontext, holds multiple JSON documents. Each document is unique to its ID, but they all contain similar data points. I am trying to flatten each row of the raw_json table into a normal table view. The indexes of the two arrays need to align so that the right values are paired together.

Here are two rows from the raw_json table, showing how the JSON is structured:

{
 "ID": "PO-103",
 "content": {
   "EEList": [
     {
       "EEListID": "PO-103-1",
       "EEProductID": "XXX1976"
     },
     {
       "EEListID": "PO-103-2",
       "EEProductID": "XXX1977"
     },
     {
       "EEListID": "PO-103-3",
       "EEProductID": "XXX1978"
     }
   ],
   "EENotesList": [
     {
       "FirstName": "John",
       "LastName": "Smith",
       "pxObjClass": "XX-XXSales-Work-XX"
     },
     {
       "FirstName": "Bob",
       "LastName": "Joe",
       "pxObjClass": "XX-XXSales-Work-XX"
     },
     {
       "FirstName": "Mike",
       "LastName": "Smith",
       "pxObjClass": "XX-XXSales-Work-XX"
     }
   ]
 }
}
{
  "ID": "PO-104",
  "content": {
    "EEList": [
      {
        "EEListID": "PO-104-1",
        "EEProductID": "XXX1979"
      },
      {
        "EEListID": "PO-104-2",
        "EEProductID": "XXX1980"
      },
      {
        "EEListID": "PO-104-3",
        "EEProductID": "XXX1981"
      }
    ],
    "EENotesList": [
      {
        "FirstName": "Sarah",
        "LastName": "Butler",
        "pxObjClass": "XX-XXSales-Work-XX"
      },
      {
        "FirstName": "Jessica",
        "LastName": "Adams",
        "pxObjClass": "XX-XXSales-Work-XX"
      }
    ]
  }
}

I want to flatten these into a table like this (desired output):

+--------+----------+-------------+-----------+----------+--------------------+
|   ID   | EEListID | EEProductID | FirstName | LastName |     pxObjClass     |
+--------+----------+-------------+-----------+----------+--------------------+
| PO-103 | PO-103-1 | XXX1976     | John      | Smith    | XX-XXSales-Work-XX |
| PO-103 | PO-103-2 | XXX1977     | Bob       | Joe      | XX-XXSales-Work-XX |
| PO-103 | PO-103-3 | XXX1978     | Mike      | Smith    | XX-XXSales-Work-XX |
| PO-104 | PO-104-1 | XXX1979     | Sarah     | Butler   | XX-XXSales-Work-XX |
| PO-104 | PO-104-2 | XXX1980     | Jessica   | Adams    | XX-XXSales-Work-XX |
+--------+----------+-------------+-----------+----------+--------------------+

I have been able to flatten the EENotesList array into a table and assign the right ID to each row. Where I go wrong is adding in the EEList values without fanning out the table. Here is my code so far:

select
    jsontext:ID::varchar as ID,
    en.value:FirstName::varchar as FirstName,
    en.value:LastName::varchar as LastName,
    en.value:pxObjClass::varchar as pxObjClass
   -- concat(ID, EEProductID, FirstName, LastName)

    from raw_json,
    lateral flatten (input => jsontext:content:EENotesList, outer => false) en;

This produces the following table (what I have now):

+--------+-----------+----------+--------------------+
|   ID   | FirstName | LastName |     pxObjClass     |
+--------+-----------+----------+--------------------+
| PO-103 | John      | Smith    | XX-XXSales-Work-XX |
| PO-103 | Bob       | Joe      | XX-XXSales-Work-XX |
| PO-103 | Mike      | Smith    | XX-XXSales-Work-XX |
| PO-104 | Sarah     | Butler   | XX-XXSales-Work-XX |
| PO-104 | Jessica   | Adams    | XX-XXSales-Work-XX |
| PO-104 | Terrence  | Williams | XX-XXSales-Work-XX |
+--------+-----------+----------+--------------------+


Solution

  • Most of this answer is preamble to get the sample data into a CTE. If, and only if, the two arrays are ordered in lockstep, you can use the index from the flatten to look up the matching element in the other array directly:

    WITH raw_json AS (
    select PARSE_json(column1) AS jsontext FROM VALUES 
     ('{
     "ID": "PO-103",
     "content": {
       "EEList": [
         {
           "EEListID": "PO-103-1",
           "EEProductID": "XXX1976",
    
         },
         {
           "EEListID": "PO-103-2",
           "EEProductID": "XXX1977",
         },
         {
           "EEListID": "PO-103-3",
           "EEProductID": "XXX1978",
         }
       ],
       "EENotesList": [
         {
           "FirstName": "John",
           "LastName": "Smith",
           "pxObjClass": "XX-XXSales-Work-XX"
         },
         {
           "FirstName": "Bob",
           "LastName": "Joe",
           "pxObjClass": "XX-XXSales-Work-XX"
         },
         {
           "FirstName": "Mike",
           "LastName": "Smith",
           "pxObjClass": "XX-XXSales-Work-XX"
         }
       ],
     }
    }'), ('{
      "ID": "PO-104",
      "content": {
        "EEList": [
          {
            "EEListID": "PO-104-1",
            "EEProductID": "XXX1979",
    
          },
          {
            "EEListID": "PO-104-2",
            "EEProductID": "XXX1980",
          },
          {
            "EEListID": "PO-104-3",
            "EEProductID": "XXX1981",
          }
        ],
        "EENotesList": [
          {
            "FirstName": "Sarah",
            "LastName": "Butler",
            "pxObjClass": "XX-XXSales-Work-XX"
          },
          {
            "FirstName": "Jessica",
            "LastName": "Adams",
            "pxObjClass": "XX-XXSales-Work-XX"
          }
        ],
      }
    }') 
    )
    select
        jsontext:ID::varchar as ID,
        en.value:FirstName::varchar as FirstName,
        en.value:LastName::varchar as LastName,
        en.value:pxObjClass::varchar as pxObjClass,
        jsontext:content.EEList[en.index].EEListID::text as EEListID,
        jsontext:content.EEList[en.index].EEProductID::text as EEProductID
        from raw_json,
        lateral flatten (input => jsontext:content:EENotesList, outer => false) en;
    

    This results in:

    ID      FIRSTNAME  LASTNAME  PXOBJCLASS          EELISTID  EEPRODUCTID
    PO-103  John       Smith     XX-XXSales-Work-XX  PO-103-1  XXX1976
    PO-103  Bob        Joe       XX-XXSales-Work-XX  PO-103-2  XXX1977
    PO-103  Mike       Smith     XX-XXSales-Work-XX  PO-103-3  XXX1978
    PO-104  Sarah      Butler    XX-XXSales-Work-XX  PO-104-1  XXX1979
    PO-104  Jessica    Adams     XX-XXSales-Work-XX  PO-104-2  XXX1980
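
For comparison, the same lockstep pairing can probably also be written by flattening both arrays and keeping only the rows where the two indexes match, which avoids the fan-out described in the question. This is a minimal sketch rather than a tested solution: it assumes the raw_json table and jsontext column from the question, and the aliases r, en and el are only illustrative names.

```sql
-- Sketch: flatten both arrays, then pair elements by array position.
-- Assumes raw_json(jsontext VARIANT) as in the question.
select
    r.jsontext:ID::varchar        as ID,
    el.value:EEListID::varchar    as EEListID,
    el.value:EEProductID::varchar as EEProductID,
    en.value:FirstName::varchar   as FirstName,
    en.value:LastName::varchar    as LastName,
    en.value:pxObjClass::varchar  as pxObjClass
from raw_json r,
    lateral flatten(input => r.jsontext:content:EENotesList) en,
    lateral flatten(input => r.jsontext:content:EEList) el
where en.index = el.index   -- keep only same-position pairs, so no fan-out
order by ID, en.index;
```

The two approaches differ when the arrays have different lengths. The direct lookup `EEList[en.index]` keeps every EENotesList element and returns NULL for EEList fields when the index is out of range, while the index join above drops any element that has no partner in the other array. In the sample data, PO-104 has three EEList entries but only two EENotesList entries, so both queries should return two rows for PO-104.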