In [10]: df
Out[10]:
   PART  AVAILABLE_INVENTORY  DEMAND
1     A                   12       6
2     A                   12       2
3     A                   12       1
4     B                   24       1
5     B                   24       1
6     B                   24       4
7     B                   24       3
Output wanted:
   PART  AVAILABLE_INVENTORY  DEMAND  AI  AI_AFTER
1     A                   12       6  12         6
2     A                   12       2   6         4
3     A                   12       1   4         3
4     B                   24       1  24        23
5     B                   24       1  23        22
6     B                   24       4  22        18
7     B                   24       3  18        15
The code I have so far is below but it is not giving the output I am looking for:
def retain(df):
    df['PREV_PART'] = df['PART'].shift()
    df['PREV_AI_AFTER'] = df['AI'].shift() - df['DEMAND'].shift()
    df['AI'] = np.where(df['PART'] != df['PREV_PART'], df['AI'], df['PREV_AI_AFTER'])
    df['AI_AFTER'] = df['AI'] - df['DEMAND']

df['AI'] = df['AVAILABLE_INVENTORY']
retain(df)
What is the fastest way to do this?
You can do it with groupby: use cumsum on the column 'DEMAND', then shift on the column 'AI_AFTER' just created. The first row of each part has no previous value, so fillna takes it from 'AVAILABLE_INVENTORY':
df['AI_AFTER'] = df['AVAILABLE_INVENTORY'] - df.groupby('PART')['DEMAND'].cumsum()
df['AI'] = df.groupby('PART')['AI_AFTER'].shift().fillna(df['AVAILABLE_INVENTORY'])
print(df)
   PART  AVAILABLE_INVENTORY  DEMAND  AI_AFTER    AI
1     A                   12       6         6  12.0
2     A                   12       2         4   6.0
3     A                   12       1         3   4.0
4     B                   24       1        23  24.0
5     B                   24       1        22  23.0
6     B                   24       4        18  22.0
7     B                   24       3        15  18.0
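Note that 'AI' comes out as float here, because shift introduces NaN before fillna replaces it. If integer output matters, one possible variant (a sketch only, not benchmarked) is to skip shift altogether: since AI_AFTER = AI - DEMAND on every row, AI can be recovered as AI_AFTER + DEMAND. The snippet below rebuilds the sample frame from the question so it can be run on its own; like the version above, it assumes 'AVAILABLE_INVENTORY' holds the starting stock on every row of a part.

```python
import pandas as pd

# Rebuild the sample frame from the question (index 1..7 as shown there).
df = pd.DataFrame(
    {
        "PART": ["A", "A", "A", "B", "B", "B", "B"],
        "AVAILABLE_INVENTORY": [12, 12, 12, 24, 24, 24, 24],
        "DEMAND": [6, 2, 1, 1, 1, 4, 3],
    },
    index=range(1, 8),
)

# Inventory left after each row: starting stock minus the running demand per part.
df["AI_AFTER"] = df["AVAILABLE_INVENTORY"] - df.groupby("PART")["DEMAND"].cumsum()

# Inventory before each row is what is left afterwards plus that row's own demand.
# No shift is involved, so no NaN appears and the column keeps its integer dtype.
df["AI"] = df["AI_AFTER"] + df["DEMAND"]

# Show the columns in the order requested in the question.
print(df[["PART", "AVAILABLE_INVENTORY", "DEMAND", "AI", "AI_AFTER"]])
```

Both versions are fully vectorized, so either should scale well; which one is faster on your data is something to measure rather than assume.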