I would like to use the rolling method of Pandas, with one adjustment: I want the result placed at the top of the window rather than at the bottom.
Currently, I am using this:
self.df['new_col'] = self.df['Zone3'].rolling(4).sum()
Which is producing this:
Date Time Period Zone4 Zone3 new_col
0 2018-02-23 00:00 900 11 2 NaN
1 2018-02-23 00:15 900 11 1 NaN
2 2018-02-23 00:30 900 7 3 NaN
3 2018-02-23 00:45 900 2 0 6.0
4 2018-02-23 01:00 900 3 2 6.0
5 2018-02-23 01:15 900 7 0 5.0
6 2018-02-23 01:30 900 2 4 6.0
What I would actually like:
Date Time Period Zone4 Zone3 new_col
0 2018-02-23 00:00 900 11 2 6.0
1 2018-02-23 00:15 900 11 1 6.0
2 2018-02-23 00:30 900 7 3 5.0
3 2018-02-23 00:45 900 2 0 6.0
4 2018-02-23 01:00 900 3 2 NaN
5 2018-02-23 01:15 900 7 0 NaN
6 2018-02-23 01:30 900 2 4 NaN
Notice that the sum is at the start of the window (position 1), not at the end (position 4).
If rolling is the wrong method, any other method would be fine. I know how to do this with a plain Python for loop; I was just hoping to do it directly on the dataframe using pandas.
Thank you in advance
Use shift:
self.df['new_col'] = self.df['Zone3'].rolling(4).sum().shift(-3)
Or, more generally, for a window of size N:
N = 4
df['new_col'] = df['Zone3'].rolling(N).sum().shift(-N+1)
print(df)
Date Time Period Zone4 Zone3 new_col
0 2018-02-23 00:00 900 11 2 6.0
1 2018-02-23 00:15 900 11 1 6.0
2 2018-02-23 00:30 900 7 3 5.0
3 2018-02-23 00:45 900 2 0 6.0
4 2018-02-23 01:00 900 3 2 NaN
5 2018-02-23 01:15 900 7 0 NaN
6 2018-02-23 01:30 900 2 4 NaN
N = 2
df['new_col'] = df['Zone3'].rolling(N).sum().shift(-N+1)
print(df)
Date Time Period Zone4 Zone3 new_col
0 2018-02-23 00:00 900 11 2 3.0
1 2018-02-23 00:15 900 11 1 4.0
2 2018-02-23 00:30 900 7 3 3.0
3 2018-02-23 00:45 900 2 0 2.0
4 2018-02-23 01:00 900 3 2 2.0
5 2018-02-23 01:15 900 7 0 4.0
6 2018-02-23 01:30 900 2 4 NaN
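As a cross-check on the shift approach, here is a minimal, self-contained sketch that rebuilds only the Zone3 column from the question and compares the shifted rolling sum against a forward-looking window. The forward-looking variant is not part of the answer above; it assumes a pandas version that ships `pandas.api.indexers.FixedForwardWindowIndexer` (1.1 or later), and the variable names `shifted`, `forward`, and `indexer` are just illustrative.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Zone3 values copied from the sample data in the question.
df = pd.DataFrame({"Zone3": [2, 1, 3, 0, 2, 0, 4]})
N = 4

# Approach from the answer: trailing rolling sum, moved up by N - 1 rows
# so each sum sits on the first row of its window.
shifted = df["Zone3"].rolling(N).sum().shift(-N + 1)

# Alternative (assumes pandas >= 1.1): a window that looks forward from
# each row, so no shift is needed. min_periods=N is passed explicitly so
# the incomplete windows at the end stay NaN instead of partial sums.
indexer = FixedForwardWindowIndexer(window_size=N)
forward = df["Zone3"].rolling(window=indexer, min_periods=N).sum()

df["new_col"] = forward
print(df)
```

Both series should hold 6.0, 6.0, 5.0, 6.0 followed by three NaN values, matching the desired output in the question. The shift version is shorter; the indexer version may read more clearly if you also want partial sums at the end, which you would get by lowering `min_periods`.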