python pandas numpy pickle

Missing 'numpy._core.numeric' when loading pandas df from pickle


A pandas dataframe was pickled on another machine via

df.to_pickle('data.pkl')

I tried loading that pickle both through

import pickle

with open('path/to/data.pkl', 'rb') as handle:
    data = pickle.load(handle)

and

import pandas as pd
data = pd.read_pickle('path/to/data.pkl')

Both attempts raised

ModuleNotFoundError: No module named 'numpy._core.numeric'

Both computers run pandas 2.2.2 and numpy 1.26.4. Switching to these versions fixed the error for some other people who ran into it, but not for me.
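A quick way to confirm which versions the loading interpreter actually sees (as opposed to what is installed somewhere else on the machine) is a sketch like the one below. It only uses the standard library; the helper name installed_version is made up for illustration and is not part of pandas or numpy.

```python
import sys
from importlib import metadata


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or a marker if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


# Show which interpreter is running, then the versions it can import from.
print("python", sys.version.split()[0], "at", sys.executable)
for package in ("numpy", "pandas"):
    print(package, installed_version(package))
```

Running this on both computers, in the same environment that writes or reads the pickle, makes it easy to compare the two setups line by line.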


Solution

  • It turns out that the way the dataframe is pickled makes a difference. If it is written with pickle.dump and the default protocol, loading works:

    import pickle

    with open('test.pkl', 'wb') as handle:
        pickle.dump(df, handle)  # default protocol
    

    On the second computer, in a separate script:

    with open('path/to/test.pkl','rb') as handle:
        data = pickle.load(handle)
    # or:
    data = pd.read_pickle('path/to/test.pkl')
    

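    However, if you pass protocol=pickle.HIGHEST_PROTOCOL when dumping, the same error comes back no matter how the pickle is loaded. That is consistent with the original failure, since df.to_pickle uses pickle.HIGHEST_PROTOCOL as its default protocol: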

    # Loading the resulting pickle raised the same ModuleNotFoundError:
    with open('test.pkl', 'wb') as handle:
        pickle.dump(df, handle, protocol=pickle.HIGHEST_PROTOCOL)
    

    I have not investigated why the error appears. I arrived at this workaround purely through trial and error.
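Since the cause is still unknown, one way to narrow it down without unpickling anything is to inspect the pickle file itself. The sketch below was not part of the trial and error above. It uses the standard-library pickletools module to report which protocol a pickle declares and which module-like strings it mentions. The function names pickle_protocol and mentioned_modules are hypothetical, and the module search is only a heuristic: it may also pick up ordinary data strings that happen to start with the given prefix. A fractions.Fraction stands in for the dataframe so the example runs without pandas.

```python
import pickle
import pickletools
from fractions import Fraction
from typing import Optional, Set

# Opcodes whose argument is a text string. In protocol 4 and later, module
# names are pushed as plain strings before a STACK_GLOBAL opcode.
_TEXT_OPCODES = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def pickle_protocol(data: bytes) -> Optional[int]:
    """Return the protocol named by the leading PROTO opcode, else None."""
    opcode, arg, _pos = next(pickletools.genops(data))
    # Protocols 0 and 1 do not start with a PROTO opcode.
    return arg if opcode.name == "PROTO" else None


def mentioned_modules(data: bytes, prefix: str) -> Set[str]:
    """Heuristic: collect strings equal to `prefix` or starting with `prefix.`."""
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # For these opcodes pickletools reports "module name" as one string.
            candidate = arg.split(" ", 1)[0]
        elif opcode.name in _TEXT_OPCODES:
            candidate = arg
        else:
            continue
        if candidate == prefix or candidate.startswith(prefix + "."):
            found.add(candidate)
    return found


# Stand-in object: compare an explicit older protocol, the default, and the highest.
sample = Fraction(1, 3)
for proto in (3, pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL):
    blob = pickle.dumps(sample, protocol=proto)
    print(proto, pickle_protocol(blob), sorted(mentioned_modules(blob, "fractions")))
```

For the files discussed here, the bytes would come from open('path/to/data.pkl', 'rb').read(), and mentioned_modules(data, 'numpy') would show whether the file refers to numpy._core.numeric or to some other numpy module path. Comparing that output for a working and a failing pickle should show where they differ.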