python pandas dataframe shuffle

Undocumented pandas DataFrame shuffle()


The following seems to work:

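import pandas as pd
# Import the loader directly; a bare `import sklearn` may not load the datasets submodule.
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df.shuffle()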

However, shuffle() does not appear in the DataFrame documentation.

Is this an internal function we are not supposed to use?


Solution

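  • df.shuffle() is not part of the standard Pandas API; on a stock installation the call raises AttributeError.
  • If it "works" in some environment, it's likely due to a custom extension or monkey patching.
  • In Pandas 2.3.1, the usual way to shuffle a DataFrame is:
    df = df.sample(frac=1).reset_index(drop=True)
    sample(frac=1) returns every row exactly once in random order, and reset_index(drop=True) replaces the shuffled index with a fresh 0..n-1 index.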
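A minimal sketch of both points, using a small made-up frame instead of the iris data so it runs without scikit-learn. The column names and the random_state value are assumptions chosen for illustration; random_state is optional and only makes the shuffle reproducible.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the iris set from the question).
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": list("vwxyz")})

# Check whether the DataFrame class exposes a shuffle attribute.
# On stock pandas this is expected to be False; True would suggest a patch or extension.
has_shuffle = hasattr(pd.DataFrame, "shuffle")
print("DataFrame.shuffle present:", has_shuffle)

# sample(frac=1) draws every row once, without replacement, in random order.
# reset_index(drop=True) discards the old row labels and renumbers from 0.
shuffled = df.sample(frac=1, random_state=0).reset_index(drop=True)
print(shuffled)
```

If the original row labels matter, for example to map rows back to their source, drop the reset_index call and keep the shuffled index.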