I've written this code:
from typing import overload, TYPE_CHECKING, Protocol, Any

import pyarrow as pa  # type: ignore[import-not-found]

class PyArrowArray(Protocol):
    @property
    def buffers(self) -> Any: ...

@overload
def func(a: PyArrowArray) -> int: ...
@overload
def func(a: str) -> str: ...
@overload
def func(a: Any) -> str | int: ...
def func(a) -> str | int:
    if isinstance(a, pa.Array):
        return 0
    return '0'

reveal_type(func(pa.array([1,2,3])))
PyArrow is a Python library which does not have type hints. However, there is a package pyarrow-stubs
which provides types for it.
I have a function that can accept either a pyarrow.Array or a str:
- if it receives a pyarrow.Array, it returns an int
- if it receives a str, it returns a str
I would like to annotate it such that:
- if pyarrow-stubs is installed, then func(pa.array([1,2,3])) is revealed to be int
- if pyarrow-stubs is not installed, then func(pa.array([1,2,3])) is revealed to be int | str, because pa is not known statically

I was hoping that the code above would accomplish that, but it doesn't. If pyarrow-stubs is not installed, I get
Revealed type is "Any"
I was expecting that the
@overload
def func(a: Any) -> str | int: ...
overload would be matched and that I'd get Revealed type is int | str
I am fairly confident that this is not possible. Per PEP 484, Any is compatible with every type, so an argument of type Any matches all overloads. What happens in that case is defined in the mypy docs:
[...] if multiple variants match due to an argument being of type Any, mypy will make the inferred type also be Any
The pyright docs describe similar behaviour. Interestingly, with pyright you can keep one of the two specific overloads, but not both, and still avoid Unknown:
# NOTE: pyright only
@overload
def func(a: PyArrowArray) -> int: ...
@overload
def func(a: Any) -> str | int: ...
def func(a):
    if isinstance(a, pa.Array):
        return 0
    return "0"

bla = cast(PyArrowArray, ...)
reveal_type(func(bla))  # int
reveal_type(func("fooo"))  # str | int :(
reveal_type(func(pa.array([1, 2, 3])))  # str | int # Not Unknown
The only solution I can see is to ensure that pa.array is not Any in the first place. Currently I see no good way to achieve that without breaking compatibility when the stubs are present. It is probably best to make the stubs a requirement, or to somehow define no-op classes that take second place to the stubs.
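To see the rule quoted from the mypy docs in isolation, here is a minimal sketch that does not depend on pyarrow at all. The names `describe` and `load_untyped` are made up for illustration; `load_untyped` stands in for any call into a library that has no stubs and therefore yields Any.

```python
from typing import Any, overload


@overload
def describe(a: int) -> int: ...
@overload
def describe(a: str) -> str: ...
def describe(a: Any) -> Any:
    # Runtime implementation: simply hand the value back.
    return a


def load_untyped() -> Any:
    """Hypothetical stand-in for a call into a library without type stubs."""
    return 1


value = load_untyped()

# `value` is Any, so it is compatible with both the int and the str overload.
# Because the two candidate return types differ, the mypy docs quoted above
# say the inferred type of this call is Any rather than a union.
result = describe(value)

# With a statically known argument, exactly one overload matches.
known = describe("text")
```

Adding a third `a: Any` overload does not change this: the Any argument still matches the earlier overloads too, so the ambiguity remains.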
I assume you are looking for a mypy solution. With pyright, you can solve it like this:
from typing import overload, TYPE_CHECKING, Callable, Protocol, Any, reveal_type
import pyarrow as pa

if TYPE_CHECKING:
    if pa.array is None:  # path only taken when stubs not present
        class unknown:
            def __call__(self, *args, **kwargs) -> unknown:
                ...
        pa.array = unknown()

# ... your original code

# type of pa.array without stubs will be: array: unknown | Unknown
reveal_type(func(bla))  # int
reveal_type(func("fooo"))  # str
reveal_type(func(pa.array([1, 2, 3])))  # int | str
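One detail of the snippet above that may not be obvious: `TYPE_CHECKING` is False when the program actually runs, so the reassignment of `pa.array` is only ever seen by the type checker. The sketch below shows just that runtime side, using a hypothetical `fake_pa` namespace in place of pyarrow so that it runs without the library installed; `UnknownShim` is likewise an illustrative name.

```python
from types import SimpleNamespace
from typing import TYPE_CHECKING

# Hypothetical stand-in for an untyped third-party module such as pyarrow.
fake_pa = SimpleNamespace(array=lambda values: list(values))
original_array = fake_pa.array

if TYPE_CHECKING:
    # Static analysers evaluate this branch; the interpreter skips it,
    # so the class below is never created at runtime.
    class UnknownShim:
        def __call__(self, *args: object, **kwargs: object) -> "UnknownShim": ...

    fake_pa.array = UnknownShim()

# At runtime the original callable is still in place.
result = fake_pa.array([1, 2, 3])
```

This only demonstrates that the shim is invisible at runtime; what a checker reveals for the call depends on the checker, as discussed above.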