I have a problem matching some punctuation characters when the Pattern.UNICODE_CHARACTER_CLASS flag is enabled.
Sample code is as follows:
final Pattern p = Pattern.compile("\\p{Punct}", Pattern.UNICODE_CHARACTER_CLASS);
final Matcher matcher = p.matcher("+");
System.out.println(matcher.find());
The output is false, although the documentation explicitly states that \p{Punct} includes characters such as !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
Besides the '+' sign, the same problem occurs for the following characters: $<=>^`|~
When Pattern.UNICODE_CHARACTER_CLASS is removed, it works fine.
I would appreciate any hints on this problem.
The Javadoc lists the characters matched by \p{Punct}, but it does so under the heading
POSIX character classes (US-ASCII only)
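With UNICODE_CHARACTER_CLASS enabled, \p{Punct} matches the Unicode punctuation category instead (the Javadoc gives it as \p{IsPunctuation}), and that category contains neither + nor $. Unicode classifies those characters as symbols, not punctuation. Take a look at the punctuation chars in Unicode at http://www.fileformat.info/info/unicode/category/Po/list.htm .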
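To make the difference concrete, here is a minimal sketch that compares the three behaviours for the characters from the question. The class name PunctDemo and the helpers found and isSymbol are made up for illustration. The [\p{P}\p{S}] pattern is only one possible workaround, under the assumption that you want Unicode punctuation plus Unicode symbols. It matches more than the ASCII list does, so check whether that suits your input.

```java
import java.util.regex.Pattern;

public class PunctDemo {
    // No flag: \p{Punct} is the POSIX class, limited to the US-ASCII punctuation list.
    static final Pattern ASCII_PUNCT = Pattern.compile("\\p{Punct}");

    // With the flag: \p{Punct} follows the Unicode Punctuation category (P).
    static final Pattern UNICODE_PUNCT =
            Pattern.compile("\\p{Punct}", Pattern.UNICODE_CHARACTER_CLASS);

    // Possible workaround (an assumption about what you need):
    // accept Unicode punctuation (P) or Unicode symbols (S).
    static final Pattern PUNCT_OR_SYMBOL =
            Pattern.compile("[\\p{P}\\p{S}]", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: true if the pattern matches anywhere in s.
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    // Hypothetical helper: true if c belongs to one of the Unicode Symbol categories.
    static boolean isSymbol(char c) {
        int type = Character.getType(c);
        return type == Character.MATH_SYMBOL
                || type == Character.CURRENCY_SYMBOL
                || type == Character.MODIFIER_SYMBOL
                || type == Character.OTHER_SYMBOL;
    }

    public static void main(String[] args) {
        // The characters reported as failing in the question.
        for (char c : "$+<=>^`|~".toCharArray()) {
            String s = String.valueOf(c);
            System.out.printf("%s  ascii=%b  unicode=%b  punctOrSymbol=%b  isSymbol=%b%n",
                    s,
                    found(ASCII_PUNCT, s),
                    found(UNICODE_PUNCT, s),
                    found(PUNCT_OR_SYMBOL, s),
                    isSymbol(c));
        }
    }
}
```

If you only care about the ASCII characters, dropping the flag, or spelling the characters out in an explicit character class, is probably the simpler choice.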