[SOLVED] Pandas dataframe rolling sum column with groupby

Pandas dataframe rolling sum column with groupby

I'm trying to create a new column that gives a rolling sum of values in the Values column. The rolling sum includes 4 rows i.e. the current row and the next three rows. I want to do this for each type in the 'Type' column.

However, if there are fewer than 4 rows before the next type starts, I want the rolling sum to use only the remaining rows. For example, if there are 2 rows after the current row for the current type, a total of 3 rows is used for the rolling sum. See the table below showing what I'm currently getting and what I expect.

Index	Type	Value	Current Rolling Sum	Expected Rolling Sum
1	left	5	22	22
2	left	9	34	34
3	left	0	NaN	25
4	left	8	NaN	25
5	left	17	NaN	17
6	straight	7	61	61
7	straight	4	77	77
8	straight	0	86	86
9	straight	50	97	97
10	straight	23	NaN	47
11	straight	13	NaN	24
12	straight	11	NaN	11

The following line of code is what I'm currently using to get the rolling sum.

rolling_sum = df.groupby('Type', sort=False)['Value'].rolling(4, min_periods = 3).sum().shift(-3).reset_index()
rolling_sum = rolling_sum.rename(columns={'Value': 'Rolling Sum'})

extracted_col = rolling_sum['Rolling Sum']
df = df.join(extracted_col)

I would really appreciate your help.

Solution

You can try running the rolling sum on the reversed values for each group and then reverse back afterward, using a min_periods of 1:

df['Rolling Sum'] = df.groupby('Type', sort=False)['Value'].apply(lambda x: x[::-1].rolling(4, min_periods=1).sum()[::-1])

Result:

   Index        Type    Value   Rolling Sum
0      1        left        5          22.0
1      2        left        9          34.0
2      3        left        0          25.0
3      4        left        8          25.0
4      5        left       17          17.0
5      6    straight        7          61.0
6      7    straight        4          77.0
7      8    straight        0          86.0
8      9    straight       50          97.0
9     10    straight       23          47.0
10    11    straight       13          24.0
11    12    straight       11          11.0