pandas lambda group-by

Python Pandas - Prior Price Change Date based on Multiple Columns


I've done my due diligence and still can't find what I need to make this happen.

This is a sample of the dataset I'm using. I need to add a column "date_price_change" that gives, for each Store/SKU combination, the date on which the "price" last changed.

I've tried-

df['date_price_change'] = df.groupby['store_nbr', 'product_sku']['date'].max()

but that gives me an error. I also need to make sure that the "date change" is earlier than the row's own date.

TIA



Solution

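  • # Sort so each store/SKU's rows are contiguous and in date order
    df = df.sort_values(by=['store_nbr', 'product_sku', 'date'])
    # True where the price differs from the previous row of the same store/SKU (the first row of each group counts as a change)
    df['price_change'] = df['retail'] != df.groupby(['store_nbr', 'product_sku'])['retail'].shift()
    # Keep the date on change rows, then carry it forward to the rows that follow
    df['date_price_change'] = df['date'].where(df['price_change']).ffill()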
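
To make the steps easier to check, here is a minimal sketch run on a small made-up dataset. The sample values are hypothetical, the column names follow the answer ('retail' is assumed to hold the price), and the `prior_price_change` column is an illustrative addition that covers the "earlier than the row's date" requirement from the question rather than part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; 'retail' is assumed to be the price column.
df = pd.DataFrame({
    "store_nbr":   [1, 1, 1, 1, 2, 2, 2],
    "product_sku": ["A", "A", "A", "A", "A", "A", "A"],
    "date": pd.to_datetime([
        "2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04",
        "2023-01-01", "2023-01-02", "2023-01-03",
    ]),
    "retail": [1.00, 1.00, 1.50, 1.50, 2.00, 2.50, 2.50],
})

keys = ["store_nbr", "product_sku"]
df = df.sort_values(keys + ["date"]).reset_index(drop=True)

# A row is a change when its price differs from the previous row of the
# same store/SKU. The first row of each group is compared against NaN,
# so it is flagged as a change as well.
df["price_change"] = df["retail"] != df.groupby(keys)["retail"].shift()

# Date of the most recent change on or before each row. Every group starts
# with a flagged row, so the forward fill never leaks across groups.
df["date_price_change"] = df["date"].where(df["price_change"]).ffill()

# Illustrative variant: most recent change strictly before each row's date.
# On change rows, fall back to the previous row's value within the group;
# the first row of each group has no earlier change and stays NaT.
prev = df.groupby(keys)["date_price_change"].shift()
df["prior_price_change"] = df["date_price_change"].where(~df["price_change"], prev)

print(df)
```

As for the error in the question: `groupby` is a method, so it has to be called with parentheses, as in `df.groupby(['store_nbr', 'product_sku'])`. Even with that fixed, `.max()` returns one row per group rather than one value per original row, so it cannot be assigned directly as a new column.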