I have this regex: aaa.+?(?>bbb)j
and this input string: aaa xxx bbby xxx bbbj
When I run the regex, it returns the whole string: aaa xxx bbby xxx bbbj
But since I used an atomic group, (?>bbb), I expected the regex to fail, because I read online that when using an atomic group the regex engine does not backtrack.
So, on finding the first bbb in the input string, it should "stick" with it and then check the next letter, which is y rather than the intended j, so it should give up and fail. However, for some reason the engine keeps trying and eventually finds the last bbbj.
How can I make it fail at the first bbb it finds?
NOTE: I tried both greedy and lazy quantifiers, but it didn't matter in the case above.
Inside an atomic group, backtracking is independent of every other part of the expression.
Therefore bbb will match independently.
The expression inside an atomic group is a separate expression that maintains its own state with regard to backtracking. When it comes across the same territory in the source, it will match exactly the same text every time.
For example, (?>bbb?) will always match bbb when it comes across it.
Since it's an actual separate expression, it can backtrack within itself to match the most it can.
The only reason (?>bbb)j did not match at bbby is that it was looking for the j; the bbb itself was matched. The atomic group only prevents backtracking into the group, so the .+? before it was still free to consume more characters, and the engine advanced to find bbbj further along.
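Note also that assertions (lookaheads and lookbehinds) are atomic in nature: they act as a pass/fail condition, and the engine does not backtrack into them once they have succeeded.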
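Because a lookahead is atomic, capturing inside it and then consuming the capture with a backreference is a common way to imitate an atomic group in engines or versions that lack (?>...). A hedged sketch in Python, using the question's input (this is an emulation, not the atomic-group syntax itself):

```python
import re

text = "aaa xxx bbby xxx bbbj"

# (?=(.+?bbb)) captures up to the first bbb without consuming it;
# \1 then consumes exactly that text. The engine does not re-enter
# the lookahead to try a longer capture, so the first bbb is final.
emulated = re.search(r"aaa(?=(.+?bbb))\1j", text)

# Same pattern on an input where the first bbb is followed by j.
emulated_ok = re.search(r"aaa(?=(.+?bbb))\1j", "aaa xxx bbbj")

print(emulated)
print(emulated_ok.group() if emulated_ok else None)
```

With this pattern emulated should be None, while emulated_ok should match its whole input.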