[SOLVED] Pandas - calculate rolling standard deviation over all columns

Pandas - calculate rolling standard deviation over all columns

Say I have a pd.DataFrame and want to calculate the rolling Standard deviation. In pandas I can use rolling(window=x).std(), but it gives me the SD by column. I however want the standard deviation over all columns in a given row.

As an example consider the pd dataframe

df = pd.DataFrame({'col1': [1,2,3,4,5,6], 'col2': [-1,-2,-3,-4,-5,-6], 'col3': [1,2,3,4,5,6]})
df
   col1  col2  col3
0     1    -1     1
1     2    -2     2
2     3    -3     3
3     4    -4     4
4     5    -5     5
5     6    -6     6

When calculating for a window size of 2, for instance, I would like to have the standard deviation in row 2 as the sum of the two rows divided by 6 (or 6-1), doesn't matter), so: np.std([2,-2,2,1,-1,1]).

I tried to calculate it on a melted dataframe, but I didn't get the result as expected:

df.reset_index().melt(id_vars='index').set_index('index')['value'].rolling(2).std()

Does anyone have an idea how to do it? I appreciate your feedback.

Solution

n = len(df.columns)
window = 2
out = df.stack().rolling(window * n).std().xs(df.columns[-1], level=-1)

out:

0         NaN
1    1.643168
2    2.639444
3    3.656045
4    4.679744
5    5.706721
dtype: float64

out is only std.

you said divide by 6 or 5, so i dont divide anything for you to choose.