
In the lazy quantifier {n,m}?, what is the use of the max m?


In regex, we have greedy and lazy quantifiers. The greedy quantifier {n,m} matches the preceding atom/character/group a minimum of n and a maximum of m occurrences, inclusive.

If I have a collection of strings:

a
aa
aaa
aaaa
aaaaaaaaaa

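With a{2,4}, it matches aa, aaa, and aaaa in full, and splits the ten-character string into aaaa, aaaa, and aa. The single a is not matched.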

That makes sense.

However, if I use the lazy quantifier a{2,4}?, every match is exactly aa: one in aa and in aaa, two in aaaa, and five in the ten-character string.

That also makes sense. It finds the shortest possible match.
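For reference, the greedy and lazy behaviour described above can be reproduced outside Vim. The following is a minimal sketch that assumes Python's re module, where the lazy form is written a{2,4}? (Vim spells the lazy quantifier with {-n,m} instead). The variable names are only illustrative.

```python
import re

# The sample strings from the question.
strings = ["a", "aa", "aaa", "aaaa", "aaaaaaaaaa"]

# Greedy: each match takes as many a's as it can, up to 4.
greedy = {s: re.findall(r"a{2,4}", s) for s in strings}

# Lazy: each match stops as soon as the minimum of 2 is satisfied.
lazy = {s: re.findall(r"a{2,4}?", s) for s in strings}

for s in strings:
    print(f"{s!r:14} greedy={greedy[s]} lazy={lazy[s]}")
```

With nothing following the quantifier in the pattern, the lazy form never needs more than the minimum, so the upper bound has no visible effect here.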

The part that I want to clarify: is there any use in giving a lazy quantifier of the form {n,m}? a max value m (in this case, the 4 in {2,4}?)? Isn't the result always the same as with {2,}?

Is there a scenario where passing a max (like the 4 in {2,4}?) is useful in a lazy quantifier?

Disclaimer: I am actually using the regular expression to search inside Vim (/a{-2,4}), not in any scripting language. I think the principle of the question is still the same.


Solution

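  • It matters when you need to consider what follows the lazily quantified expression. Laziness keeps the quantified expression from consuming characters that a later expression in the concatenation could match. A lazy quantifier will still expand past its minimum when that is the only way for the rest of the pattern to match, and the max m caps how far it may expand. Consider the string aaaaab: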

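    1. Matched from the start of the string, a{2,4}?b fails: five a's precede the b, and a{2,4}? can consume at most four. (An unanchored search still finds aaaab, starting at the second a.)
    2. Matched from the start of the string, a{2,}?b succeeds, since a{2,}? can expand to as many a's as necessary.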
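The two cases above can be checked with a short sketch. It assumes Python's re module rather than Vim, and uses re.match, which only tries a match at the start of the string. The variable names are illustrative.

```python
import re

text = "aaaaab"

# Bounded lazy quantifier: may expand from 2 up to 4 a's, then needs b.
# From index 0 there are five a's before b, so this fails.
bounded = re.match(r"a{2,4}?b", text)

# Unbounded lazy quantifier: expands as far as needed to reach b.
unbounded = re.match(r"a{2,}?b", text)

# An unanchored search is free to start later, so the bounded
# pattern still finds a match beginning at the second a.
bounded_search = re.search(r"a{2,4}?b", text)

print(bounded)                  # None
print(unbounded.group())        # aaaaab
print(bounded_search.group(), bounded_search.start())  # aaaab 1
```

So the max only changes the outcome when something after the quantifier forces it to expand: {2,4}? gives up after four repetitions, while {2,}? keeps going.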