regex groovy gstring

Groovy regex PatternSyntaxException when parsing GString-style variables


Groovy here. I'm being given a String with GString-style variables in it like:

String target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'

Keep in mind, this is not intended to be used as an actual GString! That is, I'm not going to have 3 string variables (animal, role and bodyPart, respectively) that Groovy will resolve at runtime. Instead, I'm looking to do 2 distinct things to these "target" strings: replace every ${...} variable reference with a question mark, and collect the variable names into a list of strings.

My best attempt thus far:

class TargetStringUtils {
    private static final String VARIABLE_PATTERN = "\${*}"

    // Example input: 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
    // Example desired output: 'How now brown ?. The ? has oddly-shaped ?.'
    static String replaceVarsWithQuestionMarks(String target) {
        target.replaceAll(VARIABLE_PATTERN, '?')
    }

    // Example input: 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
    // Example desired output: [animal,role,bodyPart]    } list of strings  
    static List<String> collectVariableRefs(String target) {
        target.findAll(VARIABLE_PATTERN)
    }
}

...produces a PatternSyntaxException any time I run either method:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition near index 0
${*}
^

Any ideas where I'm going awry?


Solution

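  • The pattern is not escaped for the regex engine. In a double-quoted Groovy string, \$ only suppresses GString interpolation, so "\${*}" reaches the regex engine as ${*}, where $ and { are still regex metacharacters; that is what triggers "Illegal repetition". Separately, findAll called with just a pattern returns the full matches (${animal} and so on), while you need the captured subpattern inside the {}.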

    Use

    def target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
    println target.replaceAll(/\$\{([^{}]*)\}/, '?') // => How now brown ?. The ? has oddly-shaped ?.
    
    def lst = []
    def m = target =~ /\$\{([^{}]*)\}/
    // m[i] is [fullMatch, group1]; keep only group 1, the variable name
    (0..<m.count).each { lst.add(m[it][1]) }
    println lst   // => [animal, role, bodyPart]
    


    Inside a /\$\{([^{}]*)\}/ slashy string, you can use single backslashes to escape the special regex metacharacters, and the whole regex pattern looks cleaner.
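
As a rough sketch, the class from the question could be reworked around this pattern as shown here. Keeping VARIABLE_PATTERN as a String and using the closure form of findAll (which receives the full match followed by each capture group) are choices made for illustration, not the only way to do it.

```groovy
class TargetStringUtils {
    // Matches ${name}; group 1 is the text between the braces.
    // In a slashy string the backslashes reach the regex engine unchanged.
    private static final String VARIABLE_PATTERN = /\$\{([^{}]*)\}/

    // 'How now brown ${animal}.' -> 'How now brown ?.'
    static String replaceVarsWithQuestionMarks(String target) {
        target.replaceAll(VARIABLE_PATTERN, '?')
    }

    // 'How now brown ${animal}.' -> [animal]
    static List<String> collectVariableRefs(String target) {
        // The closure gets the full match first, then group 1 (the name).
        target.findAll(VARIABLE_PATTERN) { full, name -> name }
    }
}

def target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
println TargetStringUtils.replaceVarsWithQuestionMarks(target) // => How now brown ?. The ? has oddly-shaped ?.
println TargetStringUtils.collectVariableRefs(target)          // => [animal, role, bodyPart]
```

Written as a single-quoted string, the same pattern would need every backslash doubled: '\\$\\{([^{}]*)\\}'.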