I am wondering what the regex for a word would be; I can't seem to find it anywhere. The string I'm trying to match is "loop-num + 5" and I want to extract the "loop-num" part. I am unsure what the regex would be to do so.
Pattern pattern = Pattern.compile("(loop-.*)");
Matcher matcher = pattern.matcher("5 * loop-num + 5");
if (matcher.find()) {
    String extractedString = matcher.group(1);
    System.out.println(extractedString);
}
From this I get: "loop-num + 5"
If you really want to use a regex to match words (sequences of letters only, optionally joined by hyphens), consider the following pattern:
\b\pL+(?:-\pL+)*\b
Explanation:
\b - leading word boundary
\pL+ - 1 or more Unicode letters
(?:-\pL+)* - zero or more sequences of:
    "-" - a literal hyphen
    \pL+ - 1 or more Unicode letters
\b - trailing word boundary

In Java:
Pattern pattern = Pattern.compile("\\b\\pL+(?:-\\pL+)*\\b", Pattern.UNICODE_CHARACTER_CLASS);
Matcher matcher = pattern.matcher("5 * loop-num + 5");
if (matcher.find()) {
    String extractedString = matcher.group(0);
    System.out.println(extractedString);
}
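If the input may contain more than one word, the same pattern can be applied in a loop. The following is a minimal sketch rather than a definitive implementation; the class name HyphenatedWords, the method findWords and the sample input are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class HyphenatedWords {
    // Letters only, optionally joined by single hyphens (same pattern as above).
    private static final Pattern WORD =
            Pattern.compile("\\b\\pL+(?:-\\pL+)*\\b", Pattern.UNICODE_CHARACTER_CLASS);

    // Collects every match in order of appearance.
    static List<String> findWords(String input) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(input);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [loop-num, total]
        System.out.println(findWords("5 * loop-num + total - 5"));
    }
}
```

The standalone " - " and the digits are skipped because each match must start with a letter.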
Note: in case words may include digits (not at the starting positions), you can use \b\pL\w*(?:-\pL\w*)*\b with Pattern.UNICODE_CHARACTER_CLASS. Here, \w will match letters, digits and an underscore.
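A short sketch of the digit-friendly variant follows. It is only an illustration: the class name WordsWithDigits, the method findWords and the sample inputs are hypothetical and not taken from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class WordsWithDigits {
    // Each hyphen-separated part must start with a letter;
    // digits and underscores may follow.
    private static final Pattern WORD =
            Pattern.compile("\\b\\pL\\w*(?:-\\pL\\w*)*\\b", Pattern.UNICODE_CHARACTER_CLASS);

    // Collects every match in order of appearance.
    static List<String> findWords(String input) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(input);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [loop-num2, x1]
        System.out.println(findWords("5 * loop-num2 + x1 - 5"));
        // Prints [] because "2abc" starts with a digit and has no
        // word boundary between "2" and "a".
        System.out.println(findWords("2abc"));
    }
}
```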