In VS Code, IntelliSense seems unable to infer the return type of calls to pandas.DataFrame.pipe. This is inconvenient, as I cannot rely on autocompletion after using pipe. But I haven't seen this issue mentioned anywhere, so I wonder whether it's just me or whether I am missing something.
This is what I do:
import pandas as pd
df = pd.DataFrame({'A': [1,2,3]})
df2 = df.pipe(lambda x: x + 1)
VS Code recognizes df as a DataFrame, but has no clue what df2 might be.
A first thought would be that this is due to the lack of type hinting in the lambda function. But if I try this instead:
def add_one(df: pd.DataFrame) -> pd.DataFrame:
    return df + 1
df3 = df.pipe(add_one)
IntelliSense still can't infer the type of df3.
Of course, as a last resort I can add a hint to df3 itself:
df3: pd.DataFrame = df.pipe(add_one)
But it seems like it shouldn't be necessary. IntelliSense seems very capable of inferring return types in other complex scenarios, such as those involving map.
UPDATE:
I experimented a bit more and found some interesting patterns which narrow down the range of possible causes.
I am not sufficiently familiar with Pylance to really understand why this is happening, but here is what I find:
Finding 1
It also happens with pandas.core.common.pipe if I import it directly. (I know pd.DataFrame.pipe calls pandas.core.generic.pipe, but that internally calls pandas.core.common.pipe, and I can reproduce the issue with pandas.core.common.pipe.)
Finding 2
If I copy the definition of that same function from pandas.core.common, together with the relevant imports of Callable and TypeVar, and declare T as TypeVar('T'), IntelliSense actually does its magic.
(Actually, in pandas.core.common, T is not defined as TypeVar('T') but imported from pandas._typing, where it is defined as TypeVar('T'). If I import it instead of defining it myself, it still works fine.)
From this I am tempted to conclude that pandas does everything right, but that Pylance is failing to keep track of type information for some unknown reason...
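To illustrate Finding 2 without pandas, here is a minimal sketch of a pipe-like function annotated with a TypeVar. It is a simplified stand-in and not the actual pandas source; the names `pipe`, `add_one` and `scale` are only for illustration. With annotations like these, a type checker can carry the return type of the callable through to the result.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def pipe(obj, func: Callable[..., T], *args, **kwargs) -> T:
    # Simplified stand-in: call func with obj as the first argument.
    # The return annotation T ties the result to func's return type.
    return func(obj, *args, **kwargs)


def add_one(x: int) -> int:
    return x + 1


def scale(x: int, factor: float) -> float:
    return x * factor


# A type checker can infer int here, because add_one returns int.
result_int = pipe(1, add_one)

# Extra arguments are forwarded; the inferred type follows scale (float).
result_float = pipe(2, scale, factor=1.5)
```

Hovering over `result_int` and `result_float` in an editor should show `int` and `float` respectively, assuming the checker reads these annotations rather than a separate stub.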
Finding 3
If I just copy pandas.core.common into a local file pandascommon.py and import pipe from that, it works fine too!
I got it!
It was due to the stubs shipped with Pylance, specifically those in ~/.vscode/extensions/ms-python.vscode-pylance-2022.3.2/dist/bundled/stubs/pandas/.
For example in core/common.pyi I found this stub:
def pipe(obj, func, *args, **kwargs): ...
Pylance uses this instead of the annotations in pandas.core.common.pipe. Since the stub carries no annotations at all, the return type is unknown, which causes the issue.
One heavy-handed solution is to just delete (or rename) the pandas stubs in that folder. Then pipe works again. On the other hand, it breaks some other things: for example, read_csv is no longer correctly inferred to return a DataFrame. I think the better long-run solution would be for the Pylance maintainers to improve those stubs.
A minimally invasive solution to the original pipe issue is to edit ~/.vscode/extensions/ms-python.vscode-pylance-2022.3.2/dist/bundled/stubs/pandas/core/frame.pyi in the following manner:
add from pandas._typing import T
replace the line starting with def pipe by:
def pipe(self, func: Callable[..., T], *args, **kwargs) -> T: ...
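To see why this change helps, here is a small sketch using a hypothetical `Frame` class as a stand-in for DataFrame. The method `pipe_any` is annotated with `Any` to mimic what a checker is left with when a stub gives no annotations (this is an approximation, not what Pylance does internally), while `pipe_typed` follows the signature proposed above.

```python
from typing import Any, Callable, List, TypeVar

T = TypeVar("T")


class Frame:
    """Hypothetical stand-in for a DataFrame, for illustration only."""

    def __init__(self, values: List[int]) -> None:
        self.values = values

    def pipe_any(self, func: Any, *args: Any, **kwargs: Any) -> Any:
        # Approximates the unannotated stub: the result type is lost.
        return func(self, *args, **kwargs)

    def pipe_typed(self, func: Callable[..., T], *args: Any, **kwargs: Any) -> T:
        # Same shape as the edited stub: the result type follows func.
        return func(self, *args, **kwargs)


def add_one(frame: Frame) -> Frame:
    return Frame([v + 1 for v in frame.values])


# A checker sees Any here, so no attribute completion is offered.
loose = Frame([1, 2, 3]).pipe_any(add_one)

# A checker can infer Frame here, so completion on .values works.
typed = Frame([1, 2, 3]).pipe_typed(add_one)
```

Both calls behave identically at runtime; only the information available to the editor differs.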