regex, regex-negation, regex-lookarounds

Regex to match strings containing two of any character but not three


I want a Regex to match strings containing the same character twice (not necessarily consecutive) but not if that character appears three times or more.

For example, given these two inputs:

abcbde
abcbdb

The first, abcbde, would match because it contains b twice. However, abcbdb contains b three times, so it would not match.

I have created this Regex, but it matches both:

(\w).*\1{1}

I've also tried using the ? modifier, but that still matches abcbdb, which I don't want.


Solution

  • You need two checks: first a negative lookahead to ensure that no character appears three times in the input, then a pattern that looks for a character appearing twice:

    ^(?!.*(\w).*\1.*\1).*?(\w).*\2
    

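    This is horribly inefficient compared to, say, using your programming language to build a table of character frequencies, which needs only one pass through the input. But it works. Note that \w covers only word characters, and the lookahead rejects the input if any such character appears three or more times, not just the one that appears twice.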
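
As a rough illustration of both approaches, here is a minimal Python sketch. The question does not name a language, so Python is an assumption, and the function names `matches_with_regex` and `matches_with_counts` are made up for this example. It also assumes single-line input, since `.` does not match a newline by default.

```python
import re
from collections import Counter

# The pattern from the answer above, compiled once.
PATTERN = re.compile(r'^(?!.*(\w).*\1.*\1).*?(\w).*\2')


def matches_with_regex(text: str) -> bool:
    """Return True if the answer's regex finds a match in text."""
    return PATTERN.search(text) is not None


def matches_with_counts(text: str) -> bool:
    """Apply the same rule by counting word characters in one pass.

    Hypothetical helper: only \\w characters are counted, to mirror the regex.
    """
    counts = Counter(re.findall(r'\w', text))
    has_pair = any(n == 2 for n in counts.values())
    has_triple = any(n >= 3 for n in counts.values())
    return has_pair and not has_triple


for sample in ["abcbde", "abcbdb"]:
    print(sample, matches_with_regex(sample), matches_with_counts(sample))
# abcbde True True
# abcbdb False False
```

The counting version makes the rule explicit: at least one character occurs exactly twice, and none occurs three or more times. If you only want to reject a triple of the same character that forms the pair, the counting version is easy to adapt by dropping the `has_triple` check and testing `n == 2` alone.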