python pandas rolling-sum

Rolling sum of the next N elements, including the current element


Good Morning,

I have the following dataframe:

import pandas as pd

a = [1,2,3,4,5,6]
b = pd.DataFrame({'a': a})

I would like to create a column that sums the next "n" rows of column "a", including the present value of a; I tried:

n = 2
b["r"] = b.a.rolling(n).sum() + a
print(b)
   a     r
0  1   NaN
1  2   5.0
2  3   8.0
3  4  11.0
4  5  14.0
5  6  17.0

It would be delightful to have:

   a     r
0  1   1 + 2 + 3 = 6
1  2   2 + 3 + 4 = 9
2  3   3 + 4 + 5 = 12
3  4   4 + 5 + 6 = 15
4  5   5 + 6 + 0 = 11
5  6   6 + 0 + 0 = 6

Solution

  • pandas >= 1.1

    pandas now supports "forward-looking window operations"; see the pandas documentation on window operations.

    From 1.1, you can use FixedForwardWindowIndexer:

    idx = pd.api.indexers.FixedForwardWindowIndexer
    b['a'].rolling(window=idx(window_size=3), min_periods=1).sum()
    
    0     6.0
    1     9.0
    2    12.0
    3    15.0
    4    11.0
    5     6.0
    Name: a, dtype: float64
    

    Note that this is still (at the time of writing) very buggy for datetime rolling operations - use with caution.


  • pandas <= 1.0.X

    Without built-in support, you can get your output by reversing your data, applying a rolling sum with min_periods=1, and reversing the result again.

    b.a[::-1].rolling(3, min_periods=1).sum()[::-1]
    
    0     6.0
    1     9.0
    2    12.0
    3    15.0
    4    11.0
    5     6.0
    Name: a, dtype: float64
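
Both approaches can be wrapped up and checked against each other. The sketch below is only an illustration: the helper name forward_rolling_sum is hypothetical (it is not part of pandas), and it assumes that "the next n rows, including the current one" means a window of n + 1 rows, which is why n = 2 in the question corresponds to a window size of 3 here.

```python
import pandas as pd


def forward_rolling_sum(s: pd.Series, n: int) -> pd.Series:
    """Sum each element with the next n elements (a window of n + 1 rows).

    Hypothetical helper for illustration; not a pandas function.
    """
    # Reverse, apply an ordinary trailing window, then restore the original order.
    return s[::-1].rolling(n + 1, min_periods=1).sum()[::-1]


b = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6]})
n = 2  # number of rows to look ahead, as in the question

# min_periods=1 means no NaN is produced, so casting back to int is safe.
b["r"] = forward_rolling_sum(b["a"], n).astype(int)

# Cross-check with the built-in forward indexer (requires pandas >= 1.1).
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=n + 1)
builtin = b["a"].rolling(window=indexer, min_periods=1).sum().astype(int)

print(b)
```

With the data from the question, column r comes out as 6, 9, 12, 15, 11, 6, and the built-in indexer gives the same values. The cast to int is optional; it only removes the float64 dtype that rolling sums return.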