I'm trying to use regexes to match space-separated numbers.
I can't find a precise definition of \b
("word boundary").
I had assumed that -12
would be an "integer word" (matched by \b\-?\d+\b
) but it appears that this does not work. I'd be grateful to know of ways of .
[I am using Java regexes in Java 1.6]
Example:
Pattern pattern = Pattern.compile("\\s*\\b\\-?\\d+\\s*");
String plus = " 12 ";
System.out.println("" + pattern.matcher(plus).matches());
String minus = " -12 ";
System.out.println("" + pattern.matcher(minus).matches());
pattern = Pattern.compile("\\s*\\-?\\d+\\s*");
System.out.println("" + pattern.matcher(minus).matches());
This returns:
true
false
true
A word boundary, in most regex dialects, is a position between \w
and \W
(non-word char), or at the beginning or end of a string if it begins or ends (respectively) with a word character ([0-9A-Za-z_]
).
So, in the string "-12"
, it would match before the 1 or after the 2. The dash is not a word character.