Tags: python, pandas, pandas-groupby, pandas-rolling

How to get the previous 4 weeks of sales at a given level


I would like to compute the sales for the previous four weeks at a given level (per ID and Category) in Python. For example:

ID  Category    Date    Sales
1   AA  7/02/2022   1
1   AA  31/01/2022  3
1   AA  24/01/2022  5
1   AA  10/01/2022  7
1   AA  03/01/2022  9
2   BB  7/02/2022   2
2   BB  31/01/2022  4
2   BB  24/01/2022  6
2   BB  17/01/2022  8
2   BB  10/01/2022  10

For ID 1, Category AA, on 7/02/2022 the sum over the last four weeks should be 9: the window must include the current row's date (1 + 3 + 5), and there is no sales row for 17/01/2022.


Solution

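  • You could set Date as the index, sort it, group by Category, and take the sum of Sales over a 28-day rolling window. A time-based window is closed on the right by default, so "28D" includes the current row and reaches back to the three preceding weekly dates, which is four weeks in total: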

    import pandas as pd
    import io
    
    data = '''ID  Category    Date    Sales
    1   AA  7/02/2022   1
    1   AA  31/01/2022  3
    1   AA  24/01/2022  5
    1   AA  10/01/2022  7
    1   AA  03/01/2022  9
    2   BB  7/02/2022   2
    2   BB  31/01/2022  4
    2   BB  24/01/2022  6
    2   BB  17/01/2022  8
    2   BB  10/01/2022  10'''
    
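    # Raw string avoids an invalid-escape warning for the regex separator
    df = pd.read_csv(io.StringIO(data), sep=r'\s+')
    # Dates are day-first (e.g. 31/01/2022)
    df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
    
    # A time-based window needs a sorted DatetimeIndex; "28D" covers the current row plus the previous 27 days
    result_df = df.set_index('Date').sort_index().groupby('Category')['Sales'].rolling("28D").sum().reset_index()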
    

    Output:

      Category       Date  Sales
    0       AA 2022-01-03    9.0
    1       AA 2022-01-10   16.0
    2       AA 2022-01-24   21.0
    3       AA 2022-01-31   15.0
    4       AA 2022-02-07    9.0
    5       BB 2022-01-10   10.0
    6       BB 2022-01-17   18.0
    7       BB 2022-01-24   24.0
    8       BB 2022-01-31   28.0
    9       BB 2022-02-07   20.0
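
The result above drops the ID column and replaces Sales with the rolling sum. If you would rather keep the original rows and add the four-week total as an extra column, one possible approach is to merge the rolling result back on Category and Date. This is only a sketch: the column name `Sales_4w` is made up for illustration, and it assumes each (Category, Date) pair appears at most once in the data.

```python
import io
import pandas as pd

data = '''ID  Category    Date    Sales
1   AA  7/02/2022   1
1   AA  31/01/2022  3
1   AA  24/01/2022  5
1   AA  10/01/2022  7
1   AA  03/01/2022  9
2   BB  7/02/2022   2
2   BB  31/01/2022  4
2   BB  24/01/2022  6
2   BB  17/01/2022  8
2   BB  10/01/2022  10'''

df = pd.read_csv(io.StringIO(data), sep=r'\s+')
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

# Same 28-day rolling sum per Category as in the answer,
# renamed to a hypothetical column name so it does not clash with Sales.
rolled = (
    df.set_index('Date')
      .sort_index()
      .groupby('Category')['Sales']
      .rolling('28D')
      .sum()
      .rename('Sales_4w')
      .reset_index()
)

# Attach the rolling sum to the original rows; ID and Sales are preserved.
# Assumes (Category, Date) is unique, otherwise the merge would duplicate rows.
out = df.merge(rolled, on=['Category', 'Date'], how='left')
print(out)
```

For the row 1 AA 7/02/2022 this gives `Sales_4w` equal to 9, matching the expected value in the question.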