java regex reluctant-quantifiers

Reluctant quantifier acting greedy


I have this code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // text holds the input string shown below
    String result = text;

    String regex = "((\\(|\\[)(.+)(\\)|\\])){1}?";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(result);

    System.out.println("start");
    System.out.println(result);
    // Print the position and text of every match
    while (matcher.find()) {
        System.out.print("Start index: " + matcher.start());
        System.out.print(" End index: " + matcher.end() + " ");
        System.out.println(matcher.group());
    }
    System.out.println("finish");

And I have a string that I want to match:

Some text sentence or sentences [something 234] (some things)

And the output I get when executing:

start
Some text sentence or sentences [something 234] (some things)
Start index: 32 End index: 61 [something 234] (some things)
finish

What I actually want is to find the bracketed parts separately: [something 234] as one match and (some things) as a second match.

Can anyone please help me build the regex accordingly? I wasn't sure how to apply the reluctant quantifier to the whole regular expression, so I wrapped the whole bracketed expression in another group. But I don't understand why this reluctant quantifier is acting greedy here. What do I need to change?


Solution

  • {1} in a regex is redundant, since any element without an explicit quantifier must already match exactly once. Making it reluctant changes nothing either: reluctance only matters when the quantifier describes a range of possible repetitions (like {min,max}, where adding ? tells the regex engine to keep the number of repetitions as close to min as possible). Here {n} describes an exact number of repetitions, so min = max = n.

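    The part that is actually greedy is .+ (the content between the brackets): it consumes as much as it can, so it runs through to the last closing bracket in the input. You should be able to solve your problem by making it reluctant, i.e. by using .+? instead.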

    So try with:

    String regex = "((\\(|\\[)(.+?)(\\)|\\]))";
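
To check the difference, here is a minimal sketch that runs both patterns against the sample string from the question. The class name BracketMatches and the helper findAll are made up for this illustration; only java.util.regex.Pattern and Matcher come from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketMatches {

    // Hypothetical helper: collects every match of regex in text, in order.
    public static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Some text sentence or sentences [something 234] (some things)";

        // Pattern from the question: .+ is greedy, and {1}? still means "exactly once".
        String greedy = "((\\(|\\[)(.+)(\\)|\\])){1}?";
        // Pattern from the answer: .+? stops at the first closing bracket that lets the match succeed.
        String reluctant = "((\\(|\\[)(.+?)(\\)|\\]))";

        System.out.println(findAll(greedy, text));
        System.out.println(findAll(reluctant, text));
    }
}
```

With the greedy pattern the list holds a single match spanning both bracketed parts; with the reluctant one it holds [something 234] and (some things) as two separate matches.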