python pandas dataframe autocorrelation

Calculating autocorrelation for each column of data in Pandas


I want to calculate the first-order autocorrelation in pandas for each column of my data.

I expected the two snippets below to give the same results, but they do not.

Which one should I use?

df[df.columns.to_list()].apply(lambda x: x.corr(x.shift()))

or

df[df.columns.to_list()].apply(lambda x: x.autocorr)

Solution

  • The second snippet should be df[df.columns.to_list()].apply(lambda x: x.autocorr()): you need the parentheses after autocorr to actually call the method. Without them, the lambda returns the method object itself instead of a number.

    With that fix, the two snippets should give exactly the same result, because autocorr is essentially one line of code, self.corr(self.shift(lag)), and lag defaults to 1. That is the same computation as your first snippet.

    Please share your data with a reproducible example if this still isn't working.

    As a side note, df[df.columns.to_list()] isn't doing anything special here: it selects every column, so you are not taking a subset of the data. You can skip it and call df.apply directly.
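
    To illustrate the points above, here is a minimal sketch on made-up data. The DataFrame, its column names a, b, c, and the random seed are assumptions for the example, not the asker's data. It checks that the corrected second snippet matches the first, and shows what the missing parentheses refer to.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: three random-walk columns (not the asker's data).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(200, 3)).cumsum(axis=0),
    columns=["a", "b", "c"],
)

# First snippet: correlate each column with itself shifted by one row.
via_corr = df.apply(lambda x: x.corr(x.shift()))

# Second snippet, corrected: note the parentheses after autocorr.
via_autocorr = df.apply(lambda x: x.autocorr())

# Both approaches run the same computation, so the results match.
pd.testing.assert_series_equal(via_corr, via_autocorr)
print(via_autocorr)

# Without parentheses, x.autocorr is just a reference to the method,
# which is why the original second snippet did not produce numbers.
print(callable(df["a"].autocorr))
```

    Since the lambda only forwards the call, df.apply(pd.Series.autocorr) should work as well.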