Looking at the table, I am looking to replicate the Rollover column in a Python dataframe. The Rollover looks at:
Rollover is created when production > max capacity. The following date will look at the previous day and will either add the delta of production - max capacity or will subtract the delta of max capacity - production.
Rollover will never be less than 0.
You can identify the index of the first time production > max_capacity
, set all previous entries equal to 0, then perform a cumulative sum (rounding any negative values up to 0, see this answer). Here is a reproducible example and result:
import numpy as np
import pandas as pd
df = pd.DataFrame({'DATE': pd.date_range('2024-01-01','2024-01-13'),'PRODUCTION': [10,8,8,15,15,12,10,9,14,16,2,2,2],'MAX_CAPACITY':[11]*13})
df['EXTRA'] = df['PRODUCTION'] - df['MAX_CAPACITY']
first_pos_idx = df['EXTRA'].gt(0).idxmax()
df.loc[:first_pos_idx-1, "EXTRA"] = 0
## cumulative sum where negative results are rounded up to 0
sumlm = np.frompyfunc(lambda a,b: 0 if a+b < 0 else a+b,2,1)
df['ROLLOVER'] = sumlm.accumulate(df['EXTRA'].values, dtype=np.object_)
Result:
DATE PRODUCTION MAX_CAPACITY EXTRA ROLLOVER
0 2024-01-01 10 11 0 0
1 2024-01-02 8 11 0 0
2 2024-01-03 8 11 0 0
3 2024-01-04 15 11 4 4
4 2024-01-05 15 11 4 8
5 2024-01-06 12 11 1 9
6 2024-01-07 10 11 -1 8
7 2024-01-08 9 11 -2 6
8 2024-01-09 14 11 3 9
9 2024-01-10 16 11 5 14
10 2024-01-11 2 11 -9 5
11 2024-01-12 2 11 -9 0
12 2024-01-13 2 11 -9 0