python regex

Understanding and Fixing the regex?


I have a regex on my input parameter:

r"^(ABC-\d{2,9})|(ABz?-\d{3})$"

Ideally it should reject parameters with ++ or -- at the end, but it accepts them. Why does the regex fail in this case when it works in all other scenarios?

ABC-12 is valid.
ABC-123456789 is valid.
AB-123 is valid.
ABz-123 is valid.

Solution

  • The problem is that your ^ and $ anchors don't apply to the entire pattern. Alternation (|) has the lowest precedence, so the pattern is parsed as ^(ABC-\d{2,9}) or (ABz?-\d{3})$: the ^ belongs only to the first alternative and the $ only to the second. So if the input starts with something that matches (ABC-\d{2,9}), the match succeeds even if more characters follow, such as ++ or --.

    To fix this, put a non-capturing group around everything except the anchors, so that both anchors apply to both alternatives:

    r"^(?:(ABC-\d{2,9})|(ABz?-\d{3}))$"
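To see the difference, here is a minimal sketch that runs both patterns over a few inputs. The names BROKEN, FIXED, samples and is_valid are made up for illustration, and it assumes the check is done with search(); the question doesn't say which function is actually used.

```python
import re

# Original pattern: ^ applies only to the first alternative, $ only to the second.
BROKEN = re.compile(r"^(ABC-\d{2,9})|(ABz?-\d{3})$")
# Fixed pattern: the non-capturing group makes both anchors cover the whole alternation.
FIXED = re.compile(r"^(?:(ABC-\d{2,9})|(ABz?-\d{3}))$")

# Four valid inputs from the question, plus inputs that should be rejected.
samples = ["ABC-12", "ABC-123456789", "AB-123", "ABz-123",
           "ABC-12++", "ABC-12--", "xxAB-123"]


def is_valid(pattern, text):
    """Return True if the pattern matches somewhere in text (hypothetical helper)."""
    return pattern.search(text) is not None


for text in samples:
    print(f"{text!r}: broken={is_valid(BROKEN, text)}, fixed={is_valid(FIXED, text)}")
```

The broken pattern accepts ABC-12++ and ABC-12-- because the first alternative has no $, and with search() it also accepts xxAB-123 because the second alternative has no ^. The fixed pattern rejects all three. Note that in Python $ also matches just before a trailing newline, so if that matters you could drop the anchors and use re.fullmatch() with the bare alternation instead.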