xpath xslt xpath-2.0

How do I "regex-quote" a string in XPath?


I have an XSL template that takes two parameters (text and separator) and calls tokenize($text, $separator). The problem is that tokenize treats the separator as a regex, and I have no control over what gets passed in as the separator string.

In Java I would call Pattern.quote(separator) to get a pattern that matches the exact string, no matter what weird characters it might contain, but I could not find anything like that in XPath. I could iterate through the string and escape every character that I recognize as special in a regex. Or I could build an iteration with substring-before and do the tokenizing that way. Is there an easier, more straightforward way?


Solution

  • You could escape the separator with the XPath replace function: match every character that has a special meaning in a regular expression and precede each occurrence with a \. The escaped string can then be passed to the XPath tokenize function as its pattern.

    Alternatively, you could implement your own tokenize stylesheet function that uses the substring-before and substring-after functions to extract substrings, and recursion to process the full string.
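
Both suggestions could look roughly like the XSLT 2.0 sketch below. The function names my:regex-quote and my:split-literal, the my namespace URI, the named template split and the token output element are all made up for illustration; none of them come from the question. The character class lists the characters that, to my understanding, XPath 2.0 regular expressions treat as metacharacters outside a character class, so treat that list as an assumption and check it against your processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- Option 1 (hypothetical helper): put a backslash in front of every
       regex metacharacter so the result matches $literal exactly.
       In the replacement string, \\ is a literal backslash and $0 is the
       matched character. -->
  <xsl:function name="my:regex-quote" as="xs:string">
    <xsl:param name="literal" as="xs:string"/>
    <xsl:sequence select="replace($literal, '[\\.?*+(){}|\[\]\^$]', '\\$0')"/>
  </xsl:function>

  <!-- Option 2 (hypothetical helper): split on a literal separator with
       substring-before, substring-after and recursion. No regex involved. -->
  <xsl:function name="my:split-literal" as="xs:string*">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="separator" as="xs:string"/>
    <xsl:choose>
      <!-- Guard against an empty separator: contains() would be true for it
           and the recursion would never make progress. -->
      <xsl:when test="$separator ne '' and contains($text, $separator)">
        <xsl:sequence select="substring-before($text, $separator),
                              my:split-literal(substring-after($text, $separator), $separator)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:sequence select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Example caller using option 1; swap in
       my:split-literal($text, $separator) to use option 2. -->
  <xsl:template name="split">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="separator" as="xs:string"/>
    <xsl:for-each select="tokenize($text, my:regex-quote($separator))">
      <token><xsl:value-of select="."/></token>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With a separator of . the first helper produces \. so tokenize('a.b|c', my:regex-quote('.')) gives the two tokens a and b|c instead of splitting on every character. Two caveats: tokenize raises an error when the pattern can match a zero-length string, so an empty separator still needs separate handling with option 1, and the recursive version returns a single empty string for empty input where tokenize would return an empty sequence.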