python python-typing mypy python-re

Static typing of a Python regular expression: 'incompatible type "str"; expected "AnyStr | Pattern[AnyStr]"'


Just to be clear, this question has nothing to do with the regular expression itself, and my code runs perfectly even though it does not pass mypy's strict verification.

Let's start from the basics. I have a class defined as follows:

from __future__ import annotations

import re
from typing import AnyStr


class MyClass:
    def __init__(self, regexp: AnyStr | re.Pattern[AnyStr]) -> None:
        if not isinstance(regexp, re.Pattern):
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[str] | re.Pattern[bytes] = regexp

The user can build the class by passing either a compiled re pattern or an AnyStr. I want the class to store the compiled value in the private _regexp attribute, so if the user did not provide a compiled pattern, I compile it before assigning it to the private attribute.

So far so good, even though I would have expected self._regexp to be of type re.Pattern[AnyStr] instead of the union of the two pattern types. Anyhow, up to here everything is OK with mypy.

Now, in some (or most) cases, the user provides the regexp string via a TOML configuration file, which is read in and parsed into a dictionary. For this case I have a class method constructor defined as follows:

    @classmethod
    def from_dict(cls, d: dict[str, str]) -> MyClass:
        r = d.get('regexp')
        if r is None:
            raise KeyError('missing regexp')
        return cls(regexp=r)

The dictionary has type dict[str, str]. I have to check that it contains the right key, because get returns None when the key is missing.

I get the error:

error: Argument "regexp" to "MyClass" has incompatible type "str"; expected "AnyStr | Pattern[AnyStr]" [arg-type]

That looks bizarre, because str should be compatible with AnyStr.

Let's say that I modify the dictionary typing to dict[str, AnyStr]. Instead of fixing the problem, it multiplies it because I get two errors:

error: Argument "regexp" to "MyClass" has incompatible type "str"; expected "AnyStr | Pattern[AnyStr]" [arg-type]
error: Argument "regexp" to "MyClass" has incompatible type "bytes"; expected "AnyStr | Pattern[AnyStr]" [arg-type]

It looks like I am going in circles: whenever I think I have fixed something, I have just moved the problem elsewhere.


Solution

  • AnyStr is a type variable, and a type variable should appear either at least twice in a function signature, or at least once in the signature and once in the enclosing class as one of its type parameters. If neither applies, you are better off using a union. See mypy Playground

    (comment by dROOOze)
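
To illustrate the union suggestion, here is a minimal sketch of the class from the question with AnyStr replaced by an explicit union in the __init__ signature. The choice to list all four accepted types inline is an assumption made for this sketch, not something taken from the linked playground, and it has not been checked against a particular mypy version; it relies on mypy narrowing regexp through the isinstance check, as the original code already did.

```python
from __future__ import annotations

import re


class MyClass:
    # Assumption for this sketch: a plain union replaces the AnyStr type variable,
    # so no type variable has to be solved when the constructor is called.
    def __init__(
        self, regexp: str | bytes | re.Pattern[str] | re.Pattern[bytes]
    ) -> None:
        if not isinstance(regexp, re.Pattern):
            # Here regexp is a str or bytes, so it still needs compiling.
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[str] | re.Pattern[bytes] = regexp

    @classmethod
    def from_dict(cls, d: dict[str, str]) -> MyClass:
        r = d.get("regexp")
        if r is None:
            raise KeyError("missing regexp")
        # r is a str, which is one of the members of the union above.
        return cls(regexp=r)
```

With this signature, the call cls(regexp=r) in from_dict passes a plain str to a parameter whose union explicitly includes str, so the argument no longer has to match a type variable.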