java regex

Java Pattern: finding out which part of an OR matched


I have the following pattern:

Pattern TAG = Pattern.compile("(<[\\w]+>)|(</[\\w]+>)");

Please note the | in the pattern.

I also have a method that does some processing with this pattern:

private String format(String s){
    Matcher m = TAG.matcher(s);
    StringBuffer sb = new StringBuffer();

    while(m.find()){
        //This is where I need to find out what part
        //of | (or) matched in the pattern
        // to perform additional processing


    }
    return sb.toString();
}

I would like to perform different processing depending on which side of the OR matched. I know that I could split the pattern into two separate patterns and match each one in turn, but that is not the solution I am looking for: my actual regex is much more complex, and what I am trying to accomplish works best with a single regex in a single loop. So my question is:

Is there a way in Java to find out which part of the OR matched in the regex?

EDIT: I am also aware of m.group(). It does not solve my problem. The example below prints <TAG> and then </TAG>, so the first iteration of the loop matches on <[\\w]+> and the second iteration matches on </[\\w]+>. However, I need to know which alternative matched on each iteration.

static Pattern u = Pattern.compile("<[\\w]+>|</[\\w]+>");

public static void main(String[] args) {
    String xml = "<TAG>044453</TAG>";

    Matcher m = u.matcher(xml);

    while (m.find()) {
        System.out.println(m.group(0));
    }
}

Solution

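  • Take a look at the group(int) method on Matcher. It returns null when the given capturing group did not take part in the match, so if each alternative is wrapped in its own group you can do something like this: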

    if (m.group(1) != null) {
        // The first grouped parenthesized section matched
    }
    else if (m.group(2) != null) {
        // The second grouped parenthesized section matched
    }
    

    EDIT: Reverted to the original group numbers; the extra parentheses were not needed. This should work with a pattern like:

    static Pattern TAG = Pattern.compile("(<[\\w]+>)|(</[\\w]+>)");
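
For completeness, here is a minimal sketch of how the null check could slot into the format method from the question. The bracketed "open"/"close" labels and the class name TagFormatter are only placeholders I made up to stand in for the real processing; the rest uses the standard Matcher calls appendReplacement and appendTail.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagFormatter {
    // Group 1 captures an opening tag, group 2 captures a closing tag.
    private static final Pattern TAG = Pattern.compile("(<[\\w]+>)|(</[\\w]+>)");

    static String format(String s) {
        Matcher m = TAG.matcher(s);
        StringBuffer sb = new StringBuffer();

        while (m.find()) {
            String replacement;
            if (m.group(1) != null) {
                // The left side of the | matched (opening tag).
                // Placeholder processing: wrap it in an "open" label.
                replacement = "[open " + m.group(1) + "]";
            } else {
                // Group 1 is null, so the right side of the | matched (closing tag).
                replacement = "[close " + m.group(2) + "]";
            }
            // quoteReplacement keeps any $ or \ in the text from being
            // interpreted as group references.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints [open <TAG>]044453[close </TAG>]
        System.out.println(format("<TAG>044453</TAG>"));
    }
}
```

If the real regex has many alternatives, named groups (available since Java 7) may be easier to read than counting parentheses: write the alternatives as (?<open>...) and (?<close>...) and test m.group("open") != null instead of m.group(1) != null.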