javascript regex forward-reference

Forward reference in regex


What is the difference between the following regular expressions?

(\2amigo|(go!))+
(amigo|(go!))+

They both match the same strings. https://regexr.com/3u62t

How does the forward reference work?


Solution

  • It doesn't actually work at all (though as Wiktor Stribiżew pointed out, it could with other regex flavours).

    When a backreference \n (where n is a group number) refers to a capture group that has not captured anything, it matches the empty string. You can see this in e.g. /(a)?b\1/, which matches b.

    When \n refers to a capture group that appears later in the pattern, it ordinarily cannot have captured anything yet. You can see this in e.g. /\1b(a)/, which matches ba.

    You might think that within a repetition the previous iteration's captures persist, so that /(\2a(b))*/ would match abbab, but that's not how it works: each time a new iteration of the repetition starts, the captures inside the repeated group are reset. So the pattern matches abab in full, while in abbab it matches only the leading ab.

    As a result, in JavaScript a forward reference is completely and utterly useless and only ever matches the empty string. There is no difference between your two patterns.
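
The claims above can be checked with a short sketch, runnable in Node.js or a browser console. The ^ and $ anchors, the variable names, and the sample strings are additions for illustration and are not part of the original patterns; the anchors only force a whole-string match so the results are easier to read.

```javascript
// Backreference to a group that did not participate: \1 matches "".
const unmatchedGroup = /^(a)?b\1$/.test("b");

// Forward reference: \1 is evaluated before (a) has captured anything,
// so it matches "" and the pattern matches "ba".
const forwardRef = /^\1b(a)$/.test("ba");

// Inside a repetition, captures are reset at the start of each iteration,
// so \2 is always empty when it is evaluated.
const matchesAbab = /^(\2a(b))*$/.test("abab");
const matchesAbbab = /^(\2a(b))*$/.test("abbab");

// The two patterns from the question, anchored (assumption: whole-string match).
const withForwardRef = /^(\2amigo|(go!))+$/;
const withoutForwardRef = /^(amigo|(go!))+$/;

// Hypothetical sample inputs, chosen only to exercise both alternatives.
const samples = ["amigo", "go!", "go!amigo", "go!go!amigo", "amigogo!amigo", "", "amig"];
const patternsAgree = samples.every(
  (s) => withForwardRef.test(s) === withoutForwardRef.test(s)
);

console.log({ unmatchedGroup, forwardRef, matchesAbab, matchesAbbab, patternsAgree });
```

In a flavour where captures persist across iterations, "go!amigo" would be the interesting case: \2 would hold go! on the second iteration and the first alternative would need go!amigo rather than amigo. In JavaScript both patterns accept it.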