A pandas dataframe was pickled on another machine via
df.to_pickle('data.pkl')
I tried loading that pickle both through
with open('path/to/data.pkl', 'rb') as handle:
    data = pickle.load(handle)
and
data = pd.read_pickle('path/to/data.pkl')
Both attempts raised:
ModuleNotFoundError: No module named 'numpy._core.numeric'
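To see which modules a pickle will try to import without actually loading it, the opcode stream can be walked with the standard library's pickletools. The following is a minimal sketch, not a complete pickle interpreter: the helper name referenced_modules is hypothetical, and it only tracks enough state to resolve module names. It is demonstrated on an in-memory pickle of a standard-library object rather than on the dataframe from the question.

```python
import pickle
import pickletools
from collections import OrderedDict

STRING_OPS = ("SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE")


def referenced_modules(data: bytes) -> list[str]:
    """List module names a pickle refers to, without unpickling it.

    Hypothetical helper: a heuristic walk over the opcodes, not a full
    emulation of the pickle virtual machine.
    """
    modules = set()
    memo = {}    # memo index -> string, or None for anything else
    recent = []  # recently pushed values; None when not a plain string
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        if name in ("PROTO", "FRAME"):
            continue  # header and framing opcodes push nothing
        if name == "GLOBAL":
            # Protocols 0-3: the argument is "module attribute" in one string.
            modules.add(arg.split(" ", 1)[0])
            recent.append(None)
        elif name == "STACK_GLOBAL":
            # Protocol 4+: module and attribute are the two values pushed before.
            module = recent[-2] if len(recent) >= 2 else None
            if isinstance(module, str):
                modules.add(module)
            recent.append(None)
        elif name in STRING_OPS:
            recent.append(arg)
        elif name in ("BINGET", "LONG_BINGET", "GET"):
            recent.append(memo.get(arg))  # string recalled from the memo
        elif name == "MEMOIZE":
            memo[len(memo)] = recent[-1] if recent else None
        elif name in ("BINPUT", "LONG_BINPUT", "PUT"):
            memo[arg] = recent[-1] if recent else None
        else:
            recent.append(None)
    return sorted(modules)


# Demo on a standard-library object, once per opcode style.
demo_new = pickle.dumps(OrderedDict(a=1), protocol=4)
demo_old = pickle.dumps(OrderedDict(a=1), protocol=2)
print(referenced_modules(demo_new))
print(referenced_modules(demo_old))
```

For a file on disk, read its bytes in 'rb' mode and pass them to the helper. If the result contains a module path that cannot be imported on the loading machine, that is the reference the ModuleNotFoundError is complaining about.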
Both computers run pandas 2.2.2 and numpy 1.26.4. Switching to these versions worked for some others who faced this error, but not for me.
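Since the versions matter here, it can help to print them from inside the exact interpreter that does the dumping and the loading, because a machine may have several environments. A minimal sketch, assuming only the standard library; the function name environment_report and the dictionary keys are made up for illustration:

```python
import pickle
import sys
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("pandas", "numpy")):
    """Collect interpreter, pickle protocol and package versions (hypothetical helper)."""
    report = {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "default_protocol": pickle.DEFAULT_PROTOCOL,
        "highest_protocol": pickle.HIGHEST_PROTOCOL,
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running this on both computers, in the same environment that runs the pickling and unpickling scripts, gives two reports that can be compared line by line.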
It turns out that how you pickle makes a difference. If the dataframe is pickled with pickle.dump, loading works:
with open('test.pkl','wb') as handle:
    pickle.dump(df, handle)
On the second computer, in a different file:
with open('path/to/test.pkl','rb') as handle:
    data = pickle.load(handle)
# or:
data = pd.read_pickle('path/to/test.pkl')
However, if you select pickle.HIGHEST_PROTOCOL as the protocol for dumping, the same error arises no matter how the pickle is loaded:
# Produces a pickle that fails on loading with the same missing-module error:
with open('test.pkl', 'wb') as handle:
    pickle.dump(df, handle, protocol=pickle.HIGHEST_PROTOCOL)
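To check which protocol an existing pickle was written with, the first two bytes are enough for protocol 2 and later. The sketch below uses a hypothetical helper, pickle_protocol, and a plain dictionary instead of a dataframe so that it runs with the standard library alone. As far as I can tell from the pandas documentation, to_pickle defaults to the highest protocol, which would be consistent with it behaving like the HIGHEST_PROTOCOL dump above, but I have not verified that this is the cause.

```python
import pickle


def pickle_protocol(data: bytes) -> int:
    """Return the protocol declared at the start of a pickle (hypothetical helper)."""
    # Protocol 2 and later begin with the PROTO opcode (0x80) followed by
    # one byte holding the protocol number; protocols 0 and 1 have no header.
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return -1  # sentinel chosen for this sketch: protocol 0 or 1


sample = {"a": [1, 2, 3]}
default_bytes = pickle.dumps(sample)  # uses pickle.DEFAULT_PROTOCOL
highest_bytes = pickle.dumps(sample, protocol=pickle.HIGHEST_PROTOCOL)

print("default dump declares protocol", pickle_protocol(default_bytes))
print("highest dump declares protocol", pickle_protocol(highest_bytes))
```

For an uncompressed file, open it in 'rb' mode, read the first two bytes and pass them to the helper.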
I have not investigated why the error appears; I arrived at this solution purely through trial and error.