
Regex Pattern to parse Maven coordinates


I am trying to write a regex pattern to parse Maven coordinates from a POM file.

[groupId]:[artifactId]:[type]:[?optional_field]:[version]:[compile]

1. org.eclipse.aether:aether-impl:jar:0.9.0.M2:compile
2. com.google.code.findbugs:annotations:jar:3.0.0:compile
3. org.sonatype.sisu:sisu-guice:jar:no_aop:3.1.0:compile

Above are a few examples of Maven coordinates. Note that 1 and 2 share a common pattern, while 3 has an additional, optional field before the version.

I need a regex pattern that extracts only the groupId, artifactId and version.

Can anyone suggest a pattern that works for all three cases?


Solution

  • Instead of using a regex, you could split on ":" and check the length of the result. With 5 items there is no optional field and the version is at index 3; with 6 items the optional field is present and the version is at index 4.

    For example:

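    public class Main {
        public static void main(String[] args) {
            String[] strings = {
                "org.eclipse.aether:aether-impl:jar:0.9.0.M2:compile",
                "com.google.code.findbugs:annotations:jar:3.0.0:compile",
                "org.sonatype.sisu:sisu-guice:jar:no_aop:3.1.0:compile"
            };

            for (String string : strings) {
                // Fields are colon-separated; groupId and artifactId always come first.
                String[] coll = string.split(":");
                System.out.println("groupId: " + coll[0]);
                System.out.println("artifactId: " + coll[1]);
                if (coll.length == 5) {
                    // No optional field: the version is the fourth item.
                    System.out.println("version: " + coll[3]);
                } else if (coll.length == 6) {
                    // Optional field present: the version shifts to the fifth item.
                    System.out.println("version: " + coll[4]);
                }
                System.out.println();
            }
        }
    }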
    

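If you do want a single regex, as the question asks, here is a minimal sketch. It assumes that no field ever contains a colon and that the optional fourth field is the only one that can be missing (in the third example it looks like a classifier, but that is my reading of the sample data). The class name `MavenCoordinateParser`, the `parse` method and the group names are hypothetical, chosen just for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MavenCoordinateParser {
    // Assumed layout: groupId:artifactId:type[:optional]:version:scope
    // Every field is one or more non-colon characters; the fourth field is optional.
    private static final Pattern COORDINATE = Pattern.compile(
        "^(?<groupId>[^:]+):(?<artifactId>[^:]+):(?<type>[^:]+)"
        + "(?::(?<optional>[^:]+))?"
        + ":(?<version>[^:]+):(?<scope>[^:]+)$");

    /** Returns {groupId, artifactId, version}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = COORDINATE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group("groupId"), m.group("artifactId"), m.group("version")
        };
    }

    public static void main(String[] args) {
        String[] lines = {
            "org.eclipse.aether:aether-impl:jar:0.9.0.M2:compile",
            "com.google.code.findbugs:annotations:jar:3.0.0:compile",
            "org.sonatype.sisu:sisu-guice:jar:no_aop:3.1.0:compile"
        };
        for (String line : lines) {
            String[] parts = parse(line);
            System.out.println(parts[0] + " / " + parts[1] + " / " + parts[2]);
        }
    }
}
```

Because the last two fields are mandatory, the regex engine backtracks out of the optional group when only five fields are present, so the version lands in the same named group in both cases. Named groups need Java 7 or later; on older versions you would use numbered groups instead.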