sqljsonsnowflake-cloud-data-platformjson-flattener

snowflake json data flatten in existing table fields


I have a snowflake table as below

table name: raw_data

id conv_type transcript
1 chat [{"by":"sys", "text":" can I help you?", "time":"2023-01-01 11:11:11", "comments":""},{"by":"person1", "text":" yes tell me", "time":"2023-01-01 11:12:11", "comments":""}]
2 phone [{"by":"sys", "text":" can I help you with?", "time":"2023-01-01 11:13:11", "comments":""},{"by":"person2", "text":" want to know time now", "time":"2023-01-01 12:11:18", "comments":""},{"by":"sys", "text":" time is something ", "time":"2023-01-01 13:11:11", "comments":" no comments"}]

Need to flattern data like below

id conv_type transcript by text time
1 chat [{"by":"sys", "text":" can I help you?", "time":"2023-01-01 11:11:11", "comments":""},{"by":"person1", "text":" yes tell me", "time":"2023-01-01 11:12:11", "comments":""}] sys can I help you? 2023-01-01 11:11:11
1 chat [{"by":"sys", "text":" can I help you?", "time":"2023-01-01 11:11:11", "comments":""},{"by":"person1", "text":" yes tell me", "time":"2023-01-01 11:12:11", "comments":""}] person1 yes tell me 2023-01-01 11:12:11
2 phone [{"by":"sys", "text":" can I help you with?", "time":"2023-01-01 11:13:11", "comments":""},{"by":"person2", "text":" want to know time now", "time":"2023-01-01 12:11:18", "comments":""},{"by":"sys", "text":" time is something ", "time":"2023-01-01 13:11:11", "comments":" no comments"}] sys can I help you with? 2023-01-01 11:13:11
2 phone [{"by":"sys", "text":" can I help you with?", "time":"2023-01-01 11:13:11", "comments":""},{"by":"person2", "text":" want to know time now", "time":"2023-01-01 12:11:18", "comments":""},{"by":"sys", "text":" time is something ", "time":"2023-01-01 13:11:11", "comments":" no comments"}] person2 want to know time now 2023-01-01 12:11:18
2 phone [{"by":"sys", "text":" can I help you with?", "time":"2023-01-01 11:13:11", "comments":""},{"by":"person2", "text":" want to know time now", "time":"2023-01-01 12:11:18", "comments":""},{"by":"sys", "text":" time is something ", "time":"2023-01-01 13:11:11", "comments":" no comments"}] sys time is something 2023-01-01 13:11:11

I tried to use below code.

raw_data

select
   *,
   a.value:by::varchar as by,
   a.value:text::varchar as text,
   a.value:time::varchar as time
from raw_data
, lateral flatten(input => raw_data:transcript) a;

by,text,time values are showing as null, any help?


Solution

  • You can do it using parse_json() as follows :

    select t.*,
       a.value:by::varchar as "by",
       a.value:text::varchar as "text",
       a.value:time::varchar as "time"
    from raw_data t
    , lateral flatten(input => parse_json(t.transcript)) a