[SOLVED] Check that both `lookbehind` conditions are satisfied in `RegEx`

I'm trying to match usernames that are not preceded by either `RT @` or `RT@`, using the lookbehind mechanism paired with conditionals, as explained in this tutorial. The regex and the sample input are shown in Example 1:

Example 1

import re

text = 'RT @u1, @u2, u3, @u4, rt @u5:, @u3.@u1^, rt@u3'
mt_regex = r'(?i)(?<!RT )&(?<!RT)@(\w+)'
mt_pat = re.compile(mt_regex)

re.findall(mt_pat, text)

This outputs `[]` (an empty list), while the desired output is:

['u2', 'u4', 'u3', 'u1']

What am I missing? Thanks in advance.

Solution

If we break down your regex:

r"(?i)(?<!RT )&(?<!RT)@(\w+)"
(?i)        inline flag: match the remainder of the pattern case-insensitively
(?<!RT )    negative lookbehind
            asserts that the text immediately before this position is not 'RT '
&           matches the character '&' literally
(?<!RT)     negative lookbehind
            asserts that the text immediately before this position is not 'RT'
@           matches the character '@' literally
(\w+)       capturing group
            matches one or more word characters ([a-zA-Z0-9_], plus other
            Unicode letters and digits for Python 3 str patterns)
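
The breakdown can be checked directly. The sketch below runs the original pattern against the question's text and against a made-up string that happens to contain a literal `&@`. The second input (`"x&@u9"`) and the variable names are illustrative assumptions, not part of the question.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"(?i)(?<!RT )&(?<!RT)@(\w+)")

# The question's text contains no '&', so nothing can match.
question_text = "RT @u1, @u2, u3, @u4, rt @u5:, @u3.@u1^, rt@u3"
no_ampersand = original.findall(question_text)

# Hypothetical input with a literal '&' right before '@'.
with_ampersand = original.findall("x&@u9")

print(no_ampersand)    # []
print(with_ampersand)  # ['u9']
```

So the pattern only ever matches `&@name`, which is why it returns nothing for the sample text.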

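The `&` character is what prevents your regex from matching. Regex has no AND operator, so `&` is a literal character that must appear immediately before the `@`, and your text contains no `&`. Lookarounds written one after another must all succeed at the same position, so removing the `&` already gives you the AND you want: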

import re

text = "RT @u1, @u2, u3, @u4, rt @u5:, @u3.@u1^, rt@u3"
mt_regex = r"(?i)(?<!RT )(?<!RT)@(\w+)"
mt_pat = re.compile(mt_regex)

print(re.findall(mt_pat, text))
# ['u2', 'u4', 'u3', 'u1']

See this regex here
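
As a cross-check on the fix, here is a small sketch that applies each lookbehind on its own and then both together, to show that the back-to-back form behaves as a logical AND. It also tries to merge the two conditions into a single lookbehind with alternation, which Python's built-in `re` module rejects because the alternatives differ in length. The variable names are illustrative only.

```python
import re

text = "RT @u1, @u2, u3, @u4, rt @u5:, @u3.@u1^, rt@u3"

# Each lookbehind alone filters out only one of the two prefixes.
space_only = re.findall(r"(?i)(?<!RT )@(\w+)", text)    # 'rt@u3' still gets through
nospace_only = re.findall(r"(?i)(?<!RT)@(\w+)", text)   # 'RT @u1' and 'rt @u5' still get through

# Back to back, both assertions must hold just before the '@'.
both = re.findall(r"(?i)(?<!RT )(?<!RT)@(\w+)", text)

print(space_only)    # ['u2', 'u4', 'u3', 'u1', 'u3']
print(nospace_only)  # ['u1', 'u2', 'u4', 'u5', 'u3', 'u1']
print(both)          # ['u2', 'u4', 'u3', 'u1']

# A single lookbehind with alternatives of different lengths ('RT ' vs 'RT')
# fails to compile with the standard re module.
try:
    re.compile(r"(?i)(?<!RT |RT)@(\w+)")
    single_lookbehind_ok = True
except re.error:
    single_lookbehind_ok = False

print(single_lookbehind_ok)  # False
```

Keeping the two conditions as separate fixed-width lookbehinds avoids that restriction.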