java, regex, regex-group, matching

regex to match absent groups at fixed locations


I need a regex that always matches (with matches(), not find()) and always exposes 3 groups, for 3 different kinds of input, such as:

  1. 1234 ab$.5!c=:d6 efg(789)
  2. 1234 efg(567)
  3. efg(567)

The pattern

(?:^(\d+)\s+(\S+)\s)?\s*([^\(]+\(\S+\))

represents the kind of values expected in each group (without assumptions about the location of characters), but only works correctly in case #1, producing

1234, ab$.5!c=:d6, efg(789)

For cases 2 and 3, the same pattern does not work, giving, respectively

null, null, 1234 efg(567)
null, null, efg(567)

Any ideas?


Solution

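  • You could use the regex below. It nests the optional second group inside the optional leading part, so the number can appear with or without the middle token, and the third group is always captured.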

    ^(?:(\d+)\s+(?:(\S+)\s)?)?([^(]+\([^)]*\))$
    


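    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Demo {
        public static void main(String[] args) {
            String s = "1234 efg(567)";
            Matcher m = Pattern.compile("^(?:(\\d+)\\s+(?:(\\S+)\\s)?)?([^(]+\\([^)]*\\))$").matcher(s);
            // matches() requires the whole input to match, as the question asks
            if (m.matches()) {
                // an optional group that did not take part in the match is returned as null
                for (int i = 1; i <= 3; i++) {
                    System.out.println(m.group(i));
                }
            }
        }
    }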
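
To check the pattern against all three inputs from the question at once, something like the following sketch could be used. The class name `OptionalGroups` and the helper `groups` are made up for illustration. The pattern and the inputs are taken from the post.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalGroups {
    // Same pattern as in the answer above; the surrounding names are illustrative only.
    private static final Pattern P =
        Pattern.compile("^(?:(\\d+)\\s+(?:(\\S+)\\s)?)?([^(]+\\([^)]*\\))$");

    // Returns the three groups (null where a group is absent),
    // or null if the input does not match as a whole.
    static String[] groups(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "1234 ab$.5!c=:d6 efg(789)",
            "1234 efg(567)",
            "efg(567)"
        };
        for (String in : inputs) {
            System.out.println(Arrays.toString(groups(in)));
        }
    }
}
```

Since `matches()` already requires the entire input to match, the `^` and `$` anchors are redundant in this usage, though harmless. They only matter if the same pattern is later used with `find()`.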