java regex

Multiple repeated groups pattern matching in Java


Here is the problem and the output I need:
1. First case:

String str = "Variable_1 in the range 0...4";

Solution: var1 = Variable_1 Range = 0...4

    Pattern p1 = Pattern.compile("(.*[^.]) in the range of (.*[^.])$");
    Matcher m1 = p1.matcher(str);

    if (m1.find()) {
        System.out.println(m1.group(1));
        System.out.println(m1.group(2));
    }

2. Second case:

String str = "Variable_1 in the range 0...4 Variable_2 in the range 10...40";

Solution: var1 = Variable_1 range1 = 0...4 var2 = Variable_2 range2 = 10...40

3. Third case:

String str = "Variable_1 in the range 0...4 Variable_2 in the range 10...40 Variable_3 in the range 10...50";

Solution: var1 = Variable_1 range1 = 0...4 var2 = Variable_2 range2 = 10...40 var3 = Variable_3 range3 = 10...50

The first case works fine with the regex. I need to extend the same regex to the second and third cases. It should also handle any number n of such variable/range pairs in one string.


Solution

  • Assuming the "of" in your pattern is redundant (your sample strings do not contain it), you may use

    (\w+) in the range (\d+\.+\d+)
    

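    Or, if your strings do contain "of", add it: (\w+) in the range of (\d+\.+\d+). Here \w+ matches one or more letters, digits, or underscores, and \d+\.+\d+ matches one or more digits, one or more dots, and one or more digits. Because the pattern is not anchored, calling find() in a loop returns each variable/range pair in turn, however many the string contains.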

    Java demo:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RangeDemo {
        public static void main(String[] args) {
            String[] lines = {
                "Variable_1 in the range 0...4",
                "Variable_1 in the range 0...4 Variable_2 in the range 10...40",
                "Variable_1 in the range 0...4 Variable_2 in the range 10...40 Variable_3 in the range 10...50"
            };
            // Group 1 is the variable name, group 2 is the range (e.g. 0...4)
            Pattern p = Pattern.compile("(\\w+) in the range (\\d+\\.+\\d+)");
            for (String line : lines) {
                System.out.println(line);
                Matcher m = p.matcher(line);
                // Each find() call moves to the next match, so every pair is visited
                while (m.find()) {
                    System.out.println(m.group(1));
                    System.out.println(m.group(2));
                }
            }
        }
    }
    

    Output:

    Variable_1 in the range 0...4
    Variable_1
    0...4
    Variable_1 in the range 0...4 Variable_2 in the range 10...40
    Variable_1
    0...4
    Variable_2
    10...40
    Variable_1 in the range 0...4 Variable_2 in the range 10...40 Variable_3 in the range 10...50
    Variable_1
    0...4
    Variable_2
    10...40
    Variable_3
    10...50
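
If you need the results in the var1 = ... range1 = ... form from the question rather than printed line by line, one option is to collect each match into a list as you go. The following is only a sketch: the class name RangeExtractor and the method extract are hypothetical names chosen for illustration, and it assumes the same pattern as the demo above (no "of" in the input).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RangeExtractor {
    // Same pattern as in the demo above: group 1 is the name, group 2 the range.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+) in the range (\\d+\\.+\\d+)");

    // Hypothetical helper: returns one "varN = ... rangeN = ..." entry per match,
    // numbered from 1 in the order the pairs appear in the input.
    static List<String> extract(String input) {
        List<String> res = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            int n = res.size() + 1;
            res.add("var" + n + " = " + m.group(1) + " range" + n + " = " + m.group(2));
        }
        return res;
    }

    public static void main(String[] args) {
        String str = "Variable_1 in the range 0...4 Variable_2 in the range 10...40";
        System.out.println(String.join(" ", extract(str)));
        // prints: var1 = Variable_1 range1 = 0...4 var2 = Variable_2 range2 = 10...40
    }
}
```

Since the list grows by one entry per match, the same method covers the one-, two-, and three-pair strings from the question without any change to the regex.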