Hello, I'd like to handle XML files that have encoded node names, for example:
<CST_x002F_SOMETHING>
....
</CST_x002F_SOMETHING>
This node name should be decoded to CST/SOMETHING.
These node names were encoded, for example, via EncodeName. Is there a built-in XQuery function to decode these names? Or do you have an encoding/decoding function?
XML Files produced by Oracle-DB use the same escaping mechanism.
Use fn:analyze-string() to split the string and match the _xXXXX_ parts. When you encounter one of these parts, use bin:hex() to convert the hex digits to binary, then bin:unpack-unsigned-integer() to convert the binary to an integer, then fn:codepoints-to-string() to convert the integer codepoint to a string.
The binary functions are documented at https://www.saxonica.com/documentation/index.html#!functions/expath-binary
Requires Saxon-PE or higher.
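The steps above can be sketched roughly as follows. This is a minimal sketch, not a tested library function: local:decode-name is a hypothetical helper name, and it assumes the escapes always use four uppercase hex digits, as in the example from the question.

```xquery
xquery version "3.1";
declare namespace bin = "http://expath.org/ns/binary";

(: local:decode-name is a hypothetical helper name, not a built-in function. :)
declare function local:decode-name($name as xs:string) as xs:string {
  string-join(
    (: analyze-string() yields fn:match and fn:non-match children in document order :)
    for $part in analyze-string($name, "_x[0-9A-F]{4}_")/*
    return
      if ($part/self::fn:match)
      then
        (: "_x002F_" -> "002F" -> two bytes -> integer 47 -> "/" :)
        codepoints-to-string(
          bin:unpack-unsigned-integer(bin:hex(substring($part, 3, 4)), 0, 2)
        )
      else string($part)
  )
};

local:decode-name("CST_x002F_SOMETHING")
```

For the input from the question this should return CST/SOMETHING. Note that a decoded name such as CST/SOMETHING is not a valid XML element name, so the result is only usable as a string value, not for renaming the node.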
You could also use the new saxon:replace-with() function:
declare namespace bin = 'http://expath.org/ns/binary';
saxon:replace-with('CST_x002F_SOMETHING', '_x[0-9A-F]{4}_',
  function($s) {
    $s => substring(3, 4)
       => bin:hex()
       => bin:unpack-unsigned-integer(0, 2)
       => codepoints-to-string()
  })
This outputs CST/SOMETHING.
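If the EXPath binary module or the saxon: extensions are not available, the same decoding can probably be done with standard XQuery 3.1 functions only. The following is a sketch under that assumption: local:hex-to-int and local:decode-name-portable are hypothetical helper names, and the processor is assumed to support higher-order functions (needed for fold-left and the inline function).

```xquery
xquery version "3.1";

(: Hypothetical helper: parses a hex string such as "002F" into an integer. :)
declare function local:hex-to-int($hex as xs:string) as xs:integer {
  fold-left(
    string-to-codepoints(upper-case($hex)),
    0,
    function($acc, $cp) {
      (: "0".."9" are codepoints 48..57, "A".."F" are codepoints 65..70 :)
      $acc * 16 + (if ($cp ge 65) then $cp - 55 else $cp - 48)
    }
  )
};

(: Hypothetical helper: replaces every _xXXXX_ escape with the character it encodes. :)
declare function local:decode-name-portable($name as xs:string) as xs:string {
  string-join(
    for $part in analyze-string($name, "_x[0-9A-F]{4}_")/*
    return
      if ($part/self::fn:match)
      then codepoints-to-string(local:hex-to-int(substring($part, 3, 4)))
      else string($part)
  )
};

local:decode-name-portable("CST_x002F_SOMETHING")
```

The hex parsing replaces the bin:hex() and bin:unpack-unsigned-integer() pair; everything else follows the analyze-string approach described earlier.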