I have an XML document like this:
<a:price-range xmlns:c="http://iddn.icis.com/ns/core" xmlns:f="http://iddn.icis.com/ns/fields" xmlns:a="http://iddn.icis.com/ns/assets" xmlns:r="http://iddn.icis.com/ns/refdata">
<c:id>
http://iddn.icis.com/series-item/petchem/4021090-pricehistory-19990730000000</c:id>
<c:type>series-item</c:type>
<f:assessment-low>8.946586935</f:assessment-low>
<f:assessment-high>9.946586935</f:assessment-high>
<f:mid>9.44658693500000000000</f:mid>
<f:period-label>
<c:l10n xml:lang="en"/>
</f:period-label>
</a:price-range>
I want to normalise the whitespace in the XML. In the example above, the c:id element contains extra whitespace (a line break before the URL). After normalising, the XML should look like this:
<a:price-range xmlns:c="http://iddn.icis.com/ns/core" xmlns:f="http://iddn.icis.com/ns/fields" xmlns:a="http://iddn.icis.com/ns/assets" xmlns:r="http://iddn.icis.com/ns/refdata">
<c:id>http://iddn.icis.com/series-item/petchem/4021090-pricehistory-19990730000000</c:id>
<c:type>series-item</c:type>
<f:assessment-low>8.946586935</f:assessment-low>
<f:assessment-high>9.946586935</f:assessment-high>
<f:mid>9.44658693500000000000</f:mid>
<f:period-label>
<c:l10n xml:lang="en"/>
</f:period-label>
</a:price-range>
I had a look at fn:normalize-space, but it works on strings only, not on a whole XML tree.
This function worked fine for me:
(:
The rules/assumptions are:
#1 Retain one leading space if the node isn't first, has non-space content, and has leading space.
#2 Retain one trailing space if the node isn't last, isn't first, and has trailing space.
#3 Retain one trailing space if the node isn't last, is first, has trailing space, and has non-space content.
#4 Retain a single space if the node is an only child and only has space content.
:)
declare function local:normalize-space-in-xml($input)
{
  let $children := $input/node()
  let $last := count($children)
  return
    element {node-name($input)}
    {
      $input/@*,
      (: "at $pos" gives the child's position among its siblings;
         $child/position() and $child/last() would always evaluate to 1 :)
      for $child at $pos in $children
      return
        if ($child instance of element())
        then local:normalize-space-in-xml($child)
        else if ($child instance of text())
        then
          (: concat() is used so that each result is a single string; a sequence
             such as (' ', 'x') would get an extra separator space in element content :)
          (: #1 retain one leading space if the node isn't first, has non-space content, and has leading space :)
          if ($pos ne 1 and matches($child, '^\s') and normalize-space($child) ne '')
          then concat(' ', normalize-space($child))
          (: #4 retain one space if the node is an only child and has content, but it's all space;
             this overrules standard normalization :)
          else if ($last eq 1 and string-length($child) ne 0 and normalize-space($child) eq '')
          then ' '
          (: #2 if the node isn't last, isn't first, and has trailing space, retain one trailing space
             and collapse and trim the rest :)
          else if ($pos ne 1 and $pos ne $last and matches($child, '\s$'))
          then concat(normalize-space($child), ' ')
          (: #3 if the node isn't last, is first, has trailing space, and has non-space content,
             keep one trailing space :)
          else if ($pos eq 1 and $pos ne $last and matches($child, '\s$') and normalize-space($child) ne '')
          then concat(normalize-space($child), ' ')
          (: otherwise trim and collapse, that is, apply standard normalization :)
          else normalize-space($child)
        (: comments and processing instructions are copied unchanged :)
        else $child
    }
};
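For data-oriented XML like the sample in the question (no mixed content), a shorter variant may be enough. The following is only a sketch under that assumption: local:trim-text is a hypothetical name, and unlike the function above it drops whitespace-only text nodes entirely instead of keeping a single space, so it is not suitable for documents where spacing between inline elements matters.

```xquery
xquery version "1.0";

(: Hypothetical helper, not part of the original function: rebuilds the tree
   and applies normalize-space() to every text node. Whitespace-only text
   nodes are dropped, which assumes the input has no mixed content. :)
declare function local:trim-text($node as node()) as node()?
{
  typeswitch ($node)
    case document-node() return
      document { for $c in $node/node() return local:trim-text($c) }
    case element() return
      element { node-name($node) }
      {
        $node/@*,
        for $c in $node/node() return local:trim-text($c)
      }
    case text() return
      let $s := normalize-space($node)
      return if ($s eq '') then () else text { $s }
    (: comments and processing instructions are copied unchanged :)
    default return $node
};

(: Usage on a cut-down version of the sample: the text inside c:id
   starts and ends with a line break and indentation. :)
local:trim-text(
  <c:id xmlns:c="http://iddn.icis.com/ns/core">
    http://iddn.icis.com/series-item/petchem/4021090-pricehistory-19990730000000
  </c:id>
)
```

On a conforming processor this should return the c:id element with the URL as its only content and no surrounding whitespace. Returning an empty sequence for whitespace-only text nodes is the design choice that keeps the sketch short; if a single separating space has to survive between inline elements, the rule-based function above is the better fit.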