Tags: python, python-typing, mypy, pyright

Why does all(isinstance(x, str) for x in value) not help Pyright infer Iterable[str] from object?


I'm working with Pyright in strict mode and want to check if a function parameter value of type object is an Iterable[str]. I tried using:

if isinstance(value, Iterable) and all(isinstance(v, str) for v in value):
    ...  # Pyright complains: 'Type of "v" is unknown'

However, looking at the elements, Pyright complains that the type of v is still Unknown, even after the isinstance check on each element. Why doesn't all(isinstance(...)) refine the type of value to Iterable[str]?

For context: I'm implementing __contains__ in a class that inherits from collections.abc.MutableSequence.
The method signature must therefore remain def __contains__(self, value: object) -> bool; I can't change the type annotation of value to Iterable[str].

Here's my current implementation:

def __contains__(self, value: object) -> bool:
    if isinstance(value, Iterable) and all(isinstance(v, str) for v in value):
        value = "".join(value).upper()
        return value in "".join(self.sequence) # self.sequence is an Iterable[str]
    return False

Is there a way to get Pyright to properly infer the type of value here without using an explicit cast() or defining a separate TypeGuard function?
It seems to me that a separate TypeGuard should make no difference.

Alternatively: am I looking at this the wrong way? Should I try to avoid implementing __contains__ myself, because of the object type?


Solution

  • No, you cannot do this with Pyright without additional constructs. A plain all(isinstance(...)) is not a type guard that Pyright supports. Unlike filter, a type guard for all cannot be expressed in the stubs and would need special handling; see some discussion here or here.

    The type checker would not only need to hard code knowledge of all but also make assumptions about the semantics of the iterable expression it is acting upon.

    Support would need to be added specifically for all(isinstance(a, b) for a in [x, y]). This specific expression form is rarely used, so it wouldn't make sense to add the custom logic to support it.

    The official recommendation is to use manual type guards. The following function is reusable and also lets you specify the element type. Alternatively, use a cast after the if.

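    from collections.abc import Iterable
    from typing import TypeIs, reveal_type  # TypeIs needs Python 3.13+ (or typing_extensions)
    
    # Narrows obj to Iterable[T] when it is iterable and every element is an instance of typ.
    def is_iterable[T](obj: object, typ: type[T] = object) -> TypeIs[Iterable[T]]:
        return isinstance(obj, Iterable) and all(isinstance(v, typ) for v in obj)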
    
    def __contains__(self, value: object) -> bool:
        if is_iterable(value, str):
            reveal_type(value)  # Iterable[str]
            value = "".join(value).upper()
            return value in "".join(self.sequence) # self.sequence is an Iterable[str]
        reveal_type(value)  # object
        return False
    

    In strict mode, you can add a # pyright: ignore[reportUnknownArgumentType] comment to silence the complaint about v in the generator expression, or use these alternative guards instead:

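    from collections.abc import Iterable
    from typing import Any, TypeIs
    
    # Checks the elements only after obj has been narrowed to Iterable[Any] by is_iterable below.
    def is_iterable_of_type[T](obj: object, typ: type[T] = object) -> TypeIs[Iterable[T]]:
        return is_iterable(obj) and all(isinstance(v, typ) for v in obj)
    
    
    # Narrows obj to Iterable[Any], so the elements are Any instead of Unknown.
    def is_iterable(obj: object) -> TypeIs[Iterable[Any]]:
        return isinstance(obj, Iterable)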
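
The cast alternative mentioned above could look roughly like the following sketch. UpperSequence is a hypothetical stand-in class (the question only shows __contains__ and self.sequence). Copying the elements into a list is an added assumption, not part of the original answer: it keeps a one-shot iterator such as a generator from being exhausted by the all() check before the join. The first cast to Iterable[object] is intended to give v a known type instead of Unknown.

```python
from collections.abc import Iterable
from typing import cast


class UpperSequence:
    """Hypothetical container standing in for the asker's MutableSequence subclass."""

    def __init__(self, sequence: Iterable[str]) -> None:
        self.sequence: list[str] = list(sequence)

    def __contains__(self, value: object) -> bool:
        if not isinstance(value, Iterable):
            return False
        # Treat the elements as object rather than Unknown, and copy them
        # so that a generator is only consumed once.
        items = list(cast(Iterable[object], value))
        if all(isinstance(v, str) for v in items):
            # The runtime check above justifies this cast; cast() does nothing at runtime.
            needle = "".join(cast(list[str], items)).upper()
            return needle in "".join(self.sequence)
        return False


seq = UpperSequence(["A", "B", "C"])
print(["a", "b"] in seq)         # list of str
print((c for c in "bc") in seq)  # one-shot generator of str
print([1, 2] in seq)             # elements are not str
print(42 in seq)                 # not iterable at all
```

The trade-off is that cast is unchecked: the type checker trusts it, so the isinstance checks have to stay next to it for the code to remain correct.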