I would like to find the sales for the previous four weeks at a group level in Python. Say, for example:
ID Category Date Sales
1 AA 7/02/2022 1
1 AA 31/01/2022 3
1 AA 24/01/2022 5
1 AA 10/01/2022 7
1 AA 03/01/2022 9
2 BB 7/02/2022 2
2 BB 31/01/2022 4
2 BB 24/01/2022 6
2 BB 17/01/2022 8
2 BB 10/01/2022 10
For 1 AA 7/02/2022, the sum of the last four weeks will be 9 (the 17/01/2022 sales row is not there, and the window must include the current row's date).
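You could set the date as the index, group by Category, and take the sum of a 28-day rolling window of Sales. With a time-based window such as "28D", each window covers the 28 days ending on (and including) the current row's date, so a row dated exactly 28 days earlier is excluded: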
import pandas as pd
import io
data = '''ID Category Date Sales
1 AA 7/02/2022 1
1 AA 31/01/2022 3
1 AA 24/01/2022 5
1 AA 10/01/2022 7
1 AA 03/01/2022 9
2 BB 7/02/2022 2
2 BB 31/01/2022 4
2 BB 24/01/2022 6
2 BB 17/01/2022 8
2 BB 10/01/2022 10'''
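# Whitespace-separated sample data; the raw string avoids an invalid-escape warning
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
# Dates are day-first (dd/mm/yyyy)
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
# Time-based rolling windows need a sorted DatetimeIndex
result_df = (
    df.set_index('Date')
      .sort_index()
      .groupby('Category')['Sales']
      .rolling('28D')
      .sum()
      .reset_index()
)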
Output:
| | Category | Date | Sales |
|---|---|---|---|
| 0 | AA | 2022-01-03 00:00:00 | 9 |
| 1 | AA | 2022-01-10 00:00:00 | 16 |
| 2 | AA | 2022-01-24 00:00:00 | 21 |
| 3 | AA | 2022-01-31 00:00:00 | 15 |
| 4 | AA | 2022-02-07 00:00:00 | 9 |
| 5 | BB | 2022-01-10 00:00:00 | 10 |
| 6 | BB | 2022-01-17 00:00:00 | 18 |
| 7 | BB | 2022-01-24 00:00:00 | 24 |
| 8 | BB | 2022-01-31 00:00:00 | 28 |
| 9 | BB | 2022-02-07 00:00:00 | 20 |
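If you want the four-week sum next to each original row (keeping ID and the original row order), one possible variation is to group by both ID and Category and merge the result back. This is only a sketch: the column name `Sales_4w` and the `keys` list are made up for illustration, and it assumes each (ID, Category, Date) combination appears at most once, otherwise the merge would duplicate rows.

```python
import io
import pandas as pd

data = '''ID Category Date Sales
1 AA 7/02/2022 1
1 AA 31/01/2022 3
1 AA 24/01/2022 5
1 AA 10/01/2022 7
1 AA 03/01/2022 9
2 BB 7/02/2022 2
2 BB 31/01/2022 4
2 BB 24/01/2022 6
2 BB 17/01/2022 8
2 BB 10/01/2022 10'''

df = pd.read_csv(io.StringIO(data), sep=r'\s+')
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

# Grouping level (assumed here to be ID plus Category)
keys = ['ID', 'Category']

# 28-day rolling sum per group; 'Sales_4w' is a hypothetical column name
rolled = (
    df.sort_values('Date')
      .set_index('Date')
      .groupby(keys)['Sales']
      .rolling('28D')
      .sum()
      .rename('Sales_4w')
      .reset_index()
)

# Attach the rolling sum to the original rows (assumes unique keys + Date)
out = df.merge(rolled, on=keys + ['Date'], how='left')
print(out)
```

With the sample data, the first row (1 AA 2022-02-07) gets a `Sales_4w` of 9, matching the expected result in the question.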