I am looking to transform data from XML to CSV. Below is the XML data.
Note that one of the element itself contains commas (COMPONENT_DESCRIPTION
). P_OUT_PUT_S3_DATA_ITEM
is a repeating element.
<P_OUT_PUT_S3_DATA>
<P_OUT_PUT_S3_DATA_ITEM>
<RECEIPT_DATE>02-APR-21</RECEIPT_DATE>
<AWS_ASSET_ID>12345</AWS_ASSET_ID>
<RACK_PO_NUMBER>US12345</RACK_PO_NUMBER>
<VENDOR_NAME>Foxconn</VENDOR_NAME>
<MANUFACTURING_SITE>FOX-SLC</MANUFACTURING_SITE>
<COMPONENT_CATEGORY>CPU</COMPONENT_CATEGORY>
<COMPONENT_DESCRIPTION>CPU, Intel, Cascade Lake, 8259CL, 24C, 210W</COMPONENT_DESCRIPTION>
<APN>1</APN>
<COMP_VOL>76</COMP_VOL>
</P_OUT_PUT_S3_DATA_ITEM>
<P_OUT_PUT_S3_DATA_ITEM>
<RECEIPT_DATE>02-APR-21</RECEIPT_DATE>
<AWS_ASSET_ID>23456</AWS_ASSET_ID>
<RACK_PO_NUMBER>US23456</RACK_PO_NUMBER>
<VENDOR_NAME>Foxconn</VENDOR_NAME>
<MANUFACTURING_SITE>FOX-SLC</MANUFACTURING_SITE>
<COMPONENT_CATEGORY>CPU</COMPONENT_CATEGORY>
<COMPONENT_DESCRIPTION>CPU, Intel, Cascade Lake, 8259CL, 24C, 210W</COMPONENT_DESCRIPTION>
<APN>1</APN>
<COMP_VOL>76</COMP_VOL>
</P_OUT_PUT_S3_DATA_ITEM>
</P_OUT_PUT_S3_DATA>
Desired Output
RECEIPT_DATE,AWS_ASSET_ID,RACK_PO_NUMBER,VENDOR_NAME,MANUFACTURING_SITE,COMPONENT_CATEGORY,COMPONENT_DESCRIPTION,APN,COMP_VOL
02-APR-21,12345,US12345,Foxconn,FOX-SLC,CPU,"CPU, Intel, Cascade Lake, 8259CL, 24C, 210W",1,76
02-APR-21,23456,US23456,Foxconn,FOX-SLC,CPU,"CPU, Intel, Cascade Lake, 8259CL, 24C, 210W",1,76
XSLT-
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:sequence select=
"for $b in /*/P_OUT_PUT_S3_DATA_ITEM
return
concat(escape-html-uri(string-join(($b/RECEIPT_DATE,
$b/AWS_ASSET_ID,
$b/RACK_PO_NUMBER,
$b/VENDOR_NAME,
$b/MANUFACTURING_SITE,
$b/COMPONENT_CATEGORY,
$b/COMPONENT_DESCRIPTION,
$b/APN,
$b/COMP_VOL
)
/normalize-space(),
',')
),
codepoints-to-string(10))"/>
</xsl:template>
</xsl:stylesheet>
If (as it seems from your stylesheet) you are able to use XSLT 2.0, you could do something simple like:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8" />
<xsl:template match="/P_OUT_PUT_S3_DATA">
<!-- header -->
<xsl:value-of select="P_OUT_PUT_S3_DATA_ITEM[1]/*/name()" separator=","/>
<xsl:text> </xsl:text>
<!-- data -->
<xsl:for-each select="P_OUT_PUT_S3_DATA_ITEM">
<xsl:value-of select="for $i in * return concat('"', replace($i, '"', '""'), '"')" separator=","/>
<xsl:text> </xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
For XSLT 1.0 version see: http://stackoverflow.com/a/25003101/3016153