
How to compute the correlation between a fixed Series and a rolling DataFrame window


Can someone help me? I want to compute the correlation between a fixed Series (df) and each rolling window of a DataFrame (df1). The desired result is 1.0 at every index from 1 to 6, but I get NaN from index 4 onward.

import pandas as pd

df = pd.Series([2,3,4])
df1 = pd.DataFrame({'A':[1,2,3,4,5,6,7],
                   'B':[10,20,30,40,50,60,70],
                   'C':[100,200,300,400,500,600,700]})

def x(y):
    return y.corr(df)

df2 = df1.rolling(window=len(df),min_periods=1).apply(x)

[out] =

    A   B   C
0   NaN NaN NaN
1   1.0 1.0 1.0
2   1.0 1.0 1.0
3   1.0 1.0 1.0
4   NaN NaN NaN
5   NaN NaN NaN
6   NaN NaN NaN

Desired result:

    A   B   C
0   NaN NaN NaN
1   1.0 1.0 1.0
2   1.0 1.0 1.0
3   1.0 1.0 1.0
4   1.0 1.0 1.0
5   1.0 1.0 1.0
6   1.0 1.0 1.0

Solution

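  • Series.corr aligns the two Series on their index labels before computing the correlation. Each rolling window y keeps its original labels from df1 (for example 2, 3, 4 at row 4), while df has the default index 0, 1, 2. From row 4 onward fewer than two labels overlap, so the result is NaN. Give y the same index as df by calling Series.reset_index with drop=True: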

    def x(y):
        # Drop the window's original labels so it aligns with df's default index
        return y.reset_index(drop=True).corr(df)

    df2 = df1.rolling(window=len(df),min_periods=1).apply(x)
    print(df2)
         A    B    C
    0  NaN  NaN  NaN
    1  1.0  1.0  1.0
    2  1.0  1.0  1.0
    3  1.0  1.0  1.0
    4  1.0  1.0  1.0
    5  1.0  1.0  1.0
    6  1.0  1.0  1.0
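
To see the index alignment in isolation, here is a minimal sketch that rebuilds by hand the window that rolling would pass for column A at row 4. The names fixed, window, misaligned and aligned are made up for this illustration and are not part of the question's code.

    ```python
    import pandas as pd

    # Fixed Series with the default index 0, 1, 2 (same values as df above)
    fixed = pd.Series([2, 3, 4])

    # Hand-built stand-in for the column A window at row 4:
    # values 3, 4, 5 carrying their original labels 2, 3, 4
    window = pd.Series([3, 4, 5], index=[2, 3, 4])

    # corr aligns on labels first; only label 2 is shared, so one pair remains
    misaligned = window.corr(fixed)

    # After resetting the index, all three positions line up
    aligned = window.reset_index(drop=True).corr(fixed)

    print(misaligned)  # NaN, a single overlapping point has no correlation
    print(aligned)     # approximately 1.0
    ```

This also explains why rows 1 to 3 already worked in the question: those windows still share at least two labels with df, so the aligned pairs were enough to produce a correlation.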