python regex

Regex a string with a space between words


import re

texto = "ABC ABC. ABC.png ABC thumb.png"

regex = r"ABC(?!.png)|ABC(?! thumb.png)"

novo = re.sub(regex, "bueno", texto)

print(novo)

I'm trying to replace the word ABC, with exceptions. I only want to replace it if it isn't followed by ".png" or " thumb.png". In the second case the string would be "ABC thumb.png".

I expected

bueno bueno. ABC.png ABC thumb.png

But the output is this

bueno bueno. bueno.png bueno thumb.png

It isn't detecting the space and it actually messes up the first condition.


Solution

  • Starting with your original pattern:

    ABC(?!\.png)|ABC(?! thumb\.png)
    (Note: the dot is a regex metacharacter that matches any character, so it should be escaped with a backslash to match a literal dot.)
    

    This matches ABC when it is not followed by .png, or ABC when it is not followed by " thumb.png". Every occurrence of ABC satisfies at least one of those two alternatives: an ABC followed by .png is, by definition, not followed by " thumb.png", and vice versa. Therefore every occurrence of ABC is matched and replaced.
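
To see why the alternation lets everything through, here is a small sketch that lists where each branch matches on the string from the question. The variable names `first`, `second` and `both` are only for illustration.

```python
import re

texto = "ABC ABC. ABC.png ABC thumb.png"

# Start offsets matched by each branch on its own
first = [m.start() for m in re.finditer(r"ABC(?!\.png)", texto)]
second = [m.start() for m in re.finditer(r"ABC(?! thumb\.png)", texto)]

# Start offsets matched by the full alternation
both = [m.start() for m in re.finditer(r"ABC(?!\.png)|ABC(?! thumb\.png)", texto)]

print(first)   # [0, 4, 17]  -> skips only "ABC.png"
print(second)  # [0, 4, 9]   -> skips only "ABC thumb.png"
print(both)    # [0, 4, 9, 17] -> whatever one branch rejects, the other accepts
```

Each branch rejects exactly one of the two cases you care about, but the other branch then accepts it, so the OR of the two covers all four occurrences.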

    We can write the following correction:

    \bABC(?!\.png| thumb\.png)
    

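    This pattern says to match ABC at a word boundary (\b), but only when it is not immediately followed by .png or by " thumb.png".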

    Putting the alternation inside a single negative lookahead gives it AND-flavored logic: "not (A or B)" is the same as "not A and not B", so both suffixes are excluded at once.
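
As a quick check, here is the corrected pattern run against the string from the question. The second pattern, with two consecutive lookaheads, is an equivalent spelling added here for comparison; the names `chained` and `novo_chained` are illustrative.

```python
import re

texto = "ABC ABC. ABC.png ABC thumb.png"

# One negative lookahead containing both excluded suffixes
regex = r"\bABC(?!\.png| thumb\.png)"
novo = re.sub(regex, "bueno", texto)
print(novo)  # bueno bueno. ABC.png ABC thumb.png

# Equivalent form: two lookaheads in a row, both of which must pass
chained = r"\bABC(?!\.png)(?! thumb\.png)"
novo_chained = re.sub(chained, "bueno", texto)
print(novo_chained == novo)  # True
```

Either form works; the single lookahead with alternation is shorter, while the chained form makes the "and" between the two conditions more visible.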