I have a simple XSLT 3.0 stylesheet that should replace only the characters that are not ASCII (leaving the € sign alone) with a dot:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
<xsl:output method="xml" indent="no"/>
<!-- Template to match all nodes except text nodes -->
<xsl:template match="node()[not(self::text())]">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<!-- Template to match text nodes and apply regex -->
<xsl:template match="text()">
<xsl:value-of select="replace(., '[^\\x00-\\xFF€]', '.')"/>
</xsl:template>
</xsl:stylesheet>
But for some reason it matches almost everything. My test case:
<fieldName>generic currency ¤ euro € other characters देवनागरी German ä, ö, ü, and ß russian a, y, o, ы, э, я, ю, ё, и</fieldName>
becomes:
<fieldName>........................€...........................G......................................................</fieldName>
the expected output should be:
<fieldName>generic currency ¤ euro € other characters ........ German ä, ö, ü, and ß russian a, y, o, ., ., ., ., ., .</fieldName>
Several problems here:
<xsl:value-of select="replace(., '[^\\x00-\\xFF€]', '.')"/>
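Firstly, there is no need to double your backslashes in XPath regular expressions. Because you have doubled it, \\ is just an escaped backslash, and the other characters such as x, 0 and F just represent themselves. (Inside the brackets, 0-\\ also forms a range from 0 to the backslash, which covers the uppercase letters; that is why the G of "German" survived.)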
Secondly, the construct \xHH is not recognized in the XPath regex dialect. Write &#x9;-&#xFF; instead, using XML character references (hex 09, TAB, is the first codepoint allowed in XML 1.0).
Thirdly, you mentioned ASCII, but ASCII stops at 127, not at 255.
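Putting these points together, a minimal sketch of a corrected stylesheet might look like the following. It assumes the range you actually want is TAB through U+00FF plus the € sign, since your expected output keeps ¤, ä, ö, ü and ß. As a simplification that is not in your original, it uses the XSLT 3.0 declaration xsl:mode with on-no-match="shallow-copy" in place of the hand-written copy template.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <xsl:output method="xml" indent="no"/>
  <!-- Assumption: copy every node that has no explicit template (replaces the manual copy template) -->
  <xsl:mode on-no-match="shallow-copy"/>
  <!-- Replace each character outside TAB..U+00FF, other than the euro sign, with a dot.
       The bounds are XML character references, not \xHH escapes. -->
  <xsl:template match="text()">
    <xsl:value-of select="replace(., '[^&#x9;-&#xFF;€]', '.')"/>
  </xsl:template>
</xsl:stylesheet>
```

With your test case this should leave the Latin-1 characters and the € untouched and turn the Devanagari and Cyrillic characters into dots. If you really do want strict ASCII, the upper bound of the range would be &#x7F; instead of &#xFF;, and ¤, ä, ö, ü and ß would then be replaced as well.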