pythonpandasdataframegroup-by

Issue in Pandas Dataframe grouping and getting the difference of a column


I'm struck with a problem, i have date frame as below, it has data for distributor who supply the items for different locations, now i want to calculate, for a particular day, does any item ( example: apple) had different cost at a different location ( for the case of apple the price in KA and TN is same but AP it differs), how to solve this, i tried grouping with date,item,price and location not getting how to proceed or what i'm trying is right or wrong.

is it also possible to get the difference cost if there a difference?

   distributor  Location    Date    item    cost
0   GEC           KA    2025-02-01  Apple   10.0
1   GEC           TN    2025-02-01  Apple   10.0
2   GEC           AP    2025-02-01  Apple   9.0
3   GEC           KA    2025-02-02  Orange  8.0
4   GEC           TN    2025-02-02  Orange  7.0
5   GEC           AP    2025-02-02  Orange  8.5
6   GEC           KA    2025-02-03  Banana  6.0
7   GEC           TN    2025-02-03  Banana  6.0
8   GEC           AP    2025-02-03  Banana  6.0

P.S: if you need any more info please ask, please don't don vote the question i badly need an answer


Solution

  • Something like this could work.

    df.assign(difference = df.groupby(['Date','item'])['cost'].transform('max').sub(df['cost']))
    

    Output:

      distributor Location        Date    item  cost  difference
    0         GEC       KA  2025-02-01   Apple  10.0         0.0
    1         GEC       TN  2025-02-01   Apple  10.0         0.0
    2         GEC       AP  2025-02-01   Apple   9.0         1.0
    3         GEC       KA  2025-02-02  Orange   8.0         0.5
    4         GEC       TN  2025-02-02  Orange   7.0         1.5
    5         GEC       AP  2025-02-02  Orange   8.5         0.0
    6         GEC       KA  2025-02-03  Banana   6.0         0.0
    7         GEC       TN  2025-02-03  Banana   6.0         0.0
    8         GEC       AP  2025-02-03  Banana   6.0         0.0