Regex Negative Lookbehind works in PCRE but not in Python


The pattern (?<!(asp|php|jsp))\?.* works in PCRE, but it doesn't work in Python.

So what can I do to get this regex working in Python? (Python 2.7)


Solution

  • It works perfectly fine for me. Are you maybe using it wrong? Make sure to use re.search instead of re.match, because re.match only tries to match at the very start of the string:

    >>> import re
    >>> s = 'somestring.asp?1=123'
    >>> re.search(r"(?<!(asp|php|jsp))\?.*", s)
    >>> s = 'somestring.xml?1=123'
    >>> re.search(r"(?<!(asp|php|jsp))\?.*", s)
    <_sre.SRE_Match object at 0x0000000002DCB098>
    

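    This is exactly how your pattern should behave: no match for the .asp string (re.search returns None, so the interpreter prints nothing) and a match for the .xml string. As glglgl mentioned, you can get the matched text by assigning the Match object to a variable (say m) and then calling m.group(). That yields ?1=123.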

    By the way, you can leave out the inner parentheses. This pattern matches the same text; it just no longer creates a capture group:

    (?<!asp|php|jsp)\?.*
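To check the points above yourself, here is a minimal sketch. The helper name query_part and the last two sample strings are made up for illustration, and 'html' is a hypothetical extra alternative that is not part of the original pattern. It compares the two spellings of the pattern, shows what re.match returns, and tries a lookbehind whose alternatives have different lengths, which PCRE accepts but Python's re module (as far as I know) rejects.

```python
import re

# The pattern from the question, with and without the inner parentheses.
WITH_GROUP = re.compile(r"(?<!(asp|php|jsp))\?.*")
WITHOUT_GROUP = re.compile(r"(?<!asp|php|jsp)\?.*")


def query_part(pattern, url):
    """Return the matched '?...' text, or None if nothing matches."""
    m = pattern.search(url)  # search scans the whole string
    return m.group() if m else None


# Only the first two samples come from the answer; the rest are illustrative.
samples = [
    "somestring.asp?1=123",
    "somestring.xml?1=123",
    "index.php?page=2",
    "feed.rss?x",
]

for url in samples:
    # Both spellings should agree on every sample.
    assert query_part(WITH_GROUP, url) == query_part(WITHOUT_GROUP, url)
    print("%s -> %r" % (url, query_part(WITHOUT_GROUP, url)))

# re.match only tries position 0, so it finds nothing even for the .xml string.
match_result = WITHOUT_GROUP.match("somestring.xml?1=123")
print(match_result)

# Alternatives of different lengths in a lookbehind: 'html' is a made-up
# fourth extension used only to see whether compilation fails.
try:
    re.compile(r"(?<!asp|html)\?.*")
    mixed_width_ok = True
except re.error:
    mixed_width_ok = False
print(mixed_width_ok)
```

If the real pattern you are porting from PCRE has alternatives of different lengths, that last case is a likely reason it "doesn't work" in Python, even though the pattern exactly as posted compiles fine.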