sqlsql-serverxmlxquerysqlxml

Use XML as a SQL table


Using SQL Server 2016, primarily.

I have data available to me that is formatted in XML, but I need to be able to join this data to other tables in a SQL Server database.

I have looked around, but the examples I have found use a different data layout in the XML. The examples have distinct node names in the document. My XML has a node named metadata that defines the name, data type, and length of each data item. In the data node, each row has value nodes that correspond with the item nodes in the metadata node. Also, my XML uses namespacing. None of the examples I have found do. I have had problems with namespacing in other SQL/XML tasks I have worked through, so that is probably significant.

How can I use SQL to convert something like this...

<?xml version="1.0" encoding="utf-8"?>
<dataset  xmlns="http://developer.cognos.com/schemas/xmldata/1/"  xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<!--
<dataset
    xmlns="http://developer.cognos.com/schemas/xmldata/1/"
    xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
    xs:schemaLocation="http://developer.cognos.com/schemas/xmldata/1/ xmldata.xsd"
>
-->
    <metadata>
          <item name="Task" type="xs:string" length="-1"/>
          <item name="Task Number" type="xs:string" length="-1"/>
          <item name="Group" type="xs:string" length="-1"/>
          <item name="Work Order" type="xs:string" length="-1"/>
    </metadata>
    <data>
        <row>
            <value>3361B11</value>
            <value>1</value>
            <value>01</value>
            <value>MS7579</value>
        </row>
        <row>
            <value>3361B11</value>
            <value>2</value>
            <value>50</value>
            <value>MS7579</value>
        </row>
        <row>
            <value>3361B11</value>
            <value>3</value>
            <value>02</value>
            <value>JA0520</value>
        </row>
    </data>
</dataset>

...into a tabular format like...

Task Task Number Group Work Order
3361B11 1 01 MS7579
3361B11 2 50 MS7579
3361B11 3 02 JA0520

...so I can join it to other data.

I think the first step would be to query the metadata node to get the column name, data type, and length.

input (boiled down)

<?xml version="1.0" encoding="utf-8"?>
<dataset  xmlns="http://developer.cognos.com/schemas/xmldata/1/"  xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
    <metadata>
          <item name="Task" type="xs:string" length="-1"/>
          <item name="Task Number" type="xs:string" length="-1"/>
          <item name="Group" type="xs:string" length="-1"/>
          <item name="Work Order" type="xs:string" length="-1"/>
    </metadata>
</dataset>

output

id name type length
1 Task string -1
2 Task Number string -1
3 Group string -1
4 Work Order string -1

Once that is obtained, the next part would be generated dynamically using those values.

In case it is helpful for fiddling, here is a SQL statement that produces the desired result:

with
xlTask as (
  select *
  from (
    values
      ('3361B11', '1', '01', 'MS7579')
    , ('3361B11', '2', '50', 'MS7579')
    , ('3361B11', '3', '02', 'JA0520')
  ) q ([Task], [Task Number], [Group], [Work Order])
)
select *
from xlTask

Solution

  • Here's a dynamic solution from your sample data:

    DECLARE @SQL NVARCHAR(MAX)
    
    ;WITH XMLNAMESPACES(DEFAULT 'http://developer.cognos.com/schemas/xmldata/1/')
    SELECT  @SQL = N'WITH XMLNAMESPACES(DEFAULT ''http://developer.cognos.com/schemas/xmldata/1/'')
    SELECT '
     + STRING_AGG('c.value(''(value[' + T.X.value('let $i := . return count(/dataset/metadata/item[. << $i]) + 1', 'NVARCHAR(MAX)') + N']/text())[1]'', ''VARCHAR(20)'') AS '
     + QUOTENAME(T.X.value('@name', 'nvarchar(100)')) , ',')
     + N'
    FROM @xml.nodes(''/dataset/data/row'') AS t(c)'
    FROM    @xml.nodes('/dataset/metadata/item') as T(X);
    EXEC SP_EXECUTESQL @SQL, N'@xml XML', @xml = @xml
    

    It reads the metadata and generates sql that reads actual rows and creates proper aliases