Using SQL Server 2016, primarily.
I have data available to me that is formatted in XML, but I need to be able to join this data to other tables in a SQL Server database.
I have looked around, but the examples I have found use a different data layout in the XML. The examples have distinct node names in the document. My XML has a node named metadata
that defines the name, data type, and length of each data item
. In the data
node, each row
has value
nodes that correspond with the item
nodes in the metadata
node. Also, my XML uses namespacing. None of the examples I have found do. I have had problems with namespacing in other SQL/XML tasks I have worked through, so that is probably significant.
How can I use SQL to convert something like this...
<?xml version="1.0" encoding="utf-8"?>
<dataset xmlns="http://developer.cognos.com/schemas/xmldata/1/" xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<!--
<dataset
xmlns="http://developer.cognos.com/schemas/xmldata/1/"
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xs:schemaLocation="http://developer.cognos.com/schemas/xmldata/1/ xmldata.xsd"
>
-->
<metadata>
<item name="Task" type="xs:string" length="-1"/>
<item name="Task Number" type="xs:string" length="-1"/>
<item name="Group" type="xs:string" length="-1"/>
<item name="Work Order" type="xs:string" length="-1"/>
</metadata>
<data>
<row>
<value>3361B11</value>
<value>1</value>
<value>01</value>
<value>MS7579</value>
</row>
<row>
<value>3361B11</value>
<value>2</value>
<value>50</value>
<value>MS7579</value>
</row>
<row>
<value>3361B11</value>
<value>3</value>
<value>02</value>
<value>JA0520</value>
</row>
</data>
</dataset>
...into a tabular format like...
Task | Task Number | Group | Work Order |
---|---|---|---|
3361B11 | 1 | 01 | MS7579 |
3361B11 | 2 | 50 | MS7579 |
3361B11 | 3 | 02 | JA0520 |
...so I can join it to other data.
I think the first step would be to query the metadata
node to get the column name, data type, and length.
input (boiled down)
<?xml version="1.0" encoding="utf-8"?>
<dataset xmlns="http://developer.cognos.com/schemas/xmldata/1/" xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<metadata>
<item name="Task" type="xs:string" length="-1"/>
<item name="Task Number" type="xs:string" length="-1"/>
<item name="Group" type="xs:string" length="-1"/>
<item name="Work Order" type="xs:string" length="-1"/>
</metadata>
</dataset>
output
id | name | type | length |
---|---|---|---|
1 | Task | string | -1 |
2 | Task Number | string | -1 |
3 | Group | string | -1 |
4 | Work Order | string | -1 |
Once that is obtained, the next part would be generated dynamically using those values.
In case it is helpful for fiddling, here is a SQL statement that produces the desired result:
with
xlTask as (
select *
from (
values
('3361B11', '1', '01', 'MS7579')
, ('3361B11', '2', '50', 'MS7579')
, ('3361B11', '3', '02', 'JA0520')
) q ([Task], [Task Number], [Group], [Work Order])
)
select *
from xlTask
Here's a dynamic solution from your sample data:
DECLARE @SQL NVARCHAR(MAX)
;WITH XMLNAMESPACES(DEFAULT 'http://developer.cognos.com/schemas/xmldata/1/')
SELECT @SQL = N'WITH XMLNAMESPACES(DEFAULT ''http://developer.cognos.com/schemas/xmldata/1/'')
SELECT '
+ STRING_AGG('c.value(''(value[' + T.X.value('let $i := . return count(/dataset/metadata/item[. << $i]) + 1', 'NVARCHAR(MAX)') + N']/text())[1]'', ''VARCHAR(20)'') AS '
+ QUOTENAME(T.X.value('@name', 'nvarchar(100)')) , ',')
+ N'
FROM @xml.nodes(''/dataset/data/row'') AS t(c)'
FROM @xml.nodes('/dataset/metadata/item') as T(X);
EXEC SP_EXECUTESQL @SQL, N'@xml XML', @xml = @xml
It reads the metadata and generates sql that reads actual rows and creates proper aliases