snowflake-cloud-data-platform

Fetching Values from XML in snowflake query which is stored in column


Hi I have one table where XML value is store in the column

<ArrayOfValues xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ValueIncluded>
    <Value1>bf3079af-47be-49ec-ab64-ba1b6e360df9</Value1>
    <Value2>false</Value2>
    <Value3>true</Value3>
    <Value4>1</Value4>
    <Value5>false</Value5>
    <Value6>false</Value6>
    <Value7>GlobalDocument</Value7>
    <Value8>true</Value8>
    <Value9>0</Value9>
  </ValueIncluded>
  <ValueIncluded>
    <Value1>e5d9aa84-dccf-455c-8205-8d6027dcf654</Value1>
    <Value2>false</Value2>
    <Value3>true</Value3>
    <Value4>2</Value4>
    <Value5>false</Value5>
    <Value6>false</Value6>
    <Value7>GlobalDocument</Value7>
    <Value8>true</Value8>
    <Value9>0</Value9>
    <Value10>false</Value10>
    <Value11>true</Value11>
  </ValueIncluded>
</ArrayOfValues>

I need to read the value1,2,3 and so on, I tried their documentation and many attempts but getting the Id of table but failed to read the values from XML.

SELECT ID, PARSE_XML(MyColumn):"ArrayOfValues":"ValueIncluded"[1]:"Value1"::TEXT AS Val1, PARSE_XML(MyColumn):"ArrayOfValues":"ValueIncluded"[1]:"Value2"::TEXT AS Val2, PARSE_XML(MyColumn):"ArrayOfValues":"ValueIncluded"[1]:"Value3"::TEXT AS Val3, PARSE_XML(MyColumn):"ArrayOfValues":"ValueIncluded"[1]:"Value4"::TEXT AS Val4, PARSE_XML(MyColumn):"ArrayOfValues":"ValueIncluded"[1]:"Value5"::TEXT AS Val5 FROM MyTable where id='MyValue123'

I tried more method but did not get any promising result, anyone has any insights can help here.


Solution

  • So, assuming there is "always two ValueIncluded" objects, you can use fixed access like:

    with fake_data(txt) as (
        select * from values ('<ArrayOfValues xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <ValueIncluded>
        <Value1>bf3079af-47be-49ec-ab64-ba1b6e360df9</Value1>
        <Value2>false</Value2>
        <Value3>true</Value3>
        <Value4>1</Value4>
        <Value5>false</Value5>
        <Value6>false</Value6>
        <Value7>GlobalDocument</Value7>
        <Value8>true</Value8>
        <Value9>0</Value9>
      </ValueIncluded>
      <ValueIncluded>
        <Value1>e5d9aa84-dccf-455c-8205-8d6027dcf654</Value1>
        <Value2>false</Value2>
        <Value3>true</Value3>
        <Value4>2</Value4>
        <Value5>false</Value5>
        <Value6>false</Value6>
        <Value7>GlobalDocument</Value7>
        <Value8>true</Value8>
        <Value9>0</Value9>
        <Value10>false</Value10>
        <Value11>true</Value11>
      </ValueIncluded>
    </ArrayOfValues>')
    )
    
    SELECT 
        parse_xml(txt) as xml,
        XMLGET(xml, 'ValueIncluded',0) as ValueIncluded0,
        XMLGET(xml, 'ValueIncluded',1) as ValueIncluded1,
        GET(XMLGET(ValueIncluded0, 'Value1'),'$')::text as val0_1,
        GET(XMLGET(ValueIncluded0, 'Value2'),'$')::text as val0_2,
        GET(XMLGET(ValueIncluded0, 'Value3'),'$')::text as val0_3,
        GET(XMLGET(ValueIncluded1, 'Value1'),'$')::text as val1_1,
        GET(XMLGET(ValueIncluded1, 'Value2'),'$')::text as val1_2,
        GET(XMLGET(ValueIncluded1, 'Value3'),'$')::text as val1_3,
    FROM fake_data
    

    example of results

    For variable array size:

    SELECT 
        v.seq,
        v.index,
        GET(XMLGET(v.Value, 'Value1'),'$')::text as val0_1,
        GET(XMLGET(v.Value, 'Value2'),'$')::text as val0_2,
        GET(XMLGET(v.Value, 'Value3'),'$')::text as val0_3,
    
    FROM fake_data,
    table(flatten(input=>get(parse_xml(txt), '$'))) as v
    

    gives:

    enter image description here