java regex reluctant-quantifiers

Java regexp for reluctant matching


I need to find a regular expression for the following problem:

String given = "{ \"questionID\" :\"4\", \"question\":\"What is your favourite hobby?\",\"answer\" :\"answer 4\"},{ \"questionID\" :\"5\", \"question\" :\"What was the name of the first company you worked at?\",\"answer\" :\"answer 5\"}";

What I want to get: "{ \"questionID\" :\"4\", \"question\":\"What is your favourite hobby?\",\"answer\" :\"*******\"},{ \"questionID\" :\"5\", \"question\" :\"What was the name of the first company you worked at?\",\"answer\" :\"******\"}";

What I am trying:

    String regex = "(.*answer\"\\s:\"){1}(.*)(\"[\\s}]?)";
    String rep = "$1*****$3";
    System.out.println(given.replaceAll(regex, rep));

What I am getting:

"{ \"questionID\" :\"4\", \"question\":\"What is your favourite hobby?\",\"answer\" :\"answer 4\"},{ \"questionID\" :\"5\", \"question\" :\"What was the name of the first company you worked at?\",\"answer\" :\"******\"}";

Because .* is greedy, the first group runs all the way to the last "answer", so only the final answer is replaced. I want it to stop at the first "answer", perform the replacement, and then keep looking for the next one.


Solution

  • The pattern

    ("answer"\s*:\s*")(.*?)(")
    

    seems to do what you want. Here's the version escaped for a Java string literal:

    (\"answer\"\\s*:\\s*\")(.*?)(\")
    

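    The key here is to use (.*?) to match the answer and not (.*). The latter is greedy and matches as many characters as possible; the former is reluctant and stops as soon as the rest of the pattern can match. replaceAll then continues searching after each match, so every answer is replaced in turn.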
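As a rough sketch of how this might look end to end: the class name MaskAnswers and the helper mask are made up for illustration, while the pattern and the replacement string are the ones discussed above.

```java
import java.util.regex.Pattern;

class MaskAnswers {
    // Group 1: the "answer" key, the colon, and the opening quote of the value.
    // Group 2: the value itself, matched reluctantly so it stops at the first closing quote.
    // Group 3: the closing quote.
    private static final Pattern ANSWER =
            Pattern.compile("(\"answer\"\\s*:\\s*\")(.*?)(\")");

    // Hypothetical helper: replaces every answer value with a fixed mask.
    static String mask(String json) {
        return ANSWER.matcher(json).replaceAll("$1*****$3");
    }

    public static void main(String[] args) {
        String given = "{ \"questionID\" :\"4\", \"question\":\"What is your favourite hobby?\",\"answer\" :\"answer 4\"},"
                + "{ \"questionID\" :\"5\", \"question\" :\"What was the name of the first company you worked at?\",\"answer\" :\"answer 5\"}";
        System.out.println(mask(given));
    }
}
```

Since group 1 no longer starts with .*, each match begins at its own "answer" key, and the reluctant group 2 ends at the nearest closing quote, so both values are masked independently.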

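    The above pattern won't work if the answer contains escaped double quotes (\"), because it stops at the first quote it sees. Here's a more complex version that allows them by requiring the character before the closing quote not to be a backslash:

    ("answer"\s*:\s*")((.*?)[^\\])?(")

    The extra groups shift the closing quote to group 4, so use $4 instead of $3 in the replacement pattern.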
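Here is a sketch of the escaped-quote version written as a Java string literal. The names MaskEscapedAnswers, maskWithLookBack and maskUnrolled are hypothetical. One caveat: because the optional group is tried before the closing quote, an empty answer ("") can make that pattern run on to a later quote. The second pattern in the sketch is a substitute of mine, not the pattern above. It matches the value as a run of characters that are neither quotes nor backslashes, or backslash escapes, which also covers the empty case and keeps the closing quote in group 3.

```java
import java.util.regex.Pattern;

class MaskEscapedAnswers {
    // The pattern from the answer: the closing quote must not be preceded by a backslash.
    // Groups: 1 = key and opening quote, 2 and 3 = value parts, 4 = closing quote.
    private static final Pattern LOOK_BACK =
            Pattern.compile("(\"answer\"\\s*:\\s*\")((.*?)[^\\\\])?(\")");

    // Alternative (an assumption, not from the answer): consume the value as
    // ordinary characters or backslash-escaped pairs, so an escaped quote never ends it.
    // Groups: 1 = key and opening quote, 2 = value, 3 = closing quote.
    private static final Pattern UNROLLED =
            Pattern.compile("(\"answer\"\\s*:\\s*\")((?:[^\"\\\\]|\\\\.)*)(\")");

    static String maskWithLookBack(String json) {
        return LOOK_BACK.matcher(json).replaceAll("$1*****$4");
    }

    static String maskUnrolled(String json) {
        return UNROLLED.matcher(json).replaceAll("$1*****$3");
    }

    public static void main(String[] args) {
        // The value contains escaped quotes: say \"hi\" now
        String withQuotes = "{\"answer\" :\"say \\\"hi\\\" now\"}";
        System.out.println(maskWithLookBack(withQuotes));
        System.out.println(maskUnrolled(withQuotes));

        // An empty value, followed by another key.
        String empty = "{\"answer\":\"\",\"x\":\"y\"}";
        System.out.println(maskUnrolled(empty));
    }
}
```

Either way, if the input is real JSON, a JSON parser is likely to be more robust than a regex for this kind of masking.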