Tags: python, pandas, dataframe, data-analysis, exploratory-data-analysis

Drop rows based on group by of another column in pandas


I have a data set like the one below:

img

I need the Cartesian product of the Product values within each Month and Location group. The expected output is shown below:

img2

I created a new DataFrame with the unique values of Product, then cross-merged it with the data set. Now I need to drop the rows whose product does not belong to the given Month and Location.

img3


Solution

  • You can group by Month and Location, then cross-merge each group's Product column with itself (how='cross' requires pandas 1.2 or later):

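    # For each (Month, Location) group, pair every Product with every other
    out = (df.groupby(['Month', 'Location'])
           .apply(lambda g: g[['Product']].merge(g[['Product']], how='cross'))
           .droplevel(2)   # drop the per-group row index added by apply
           .reset_index()  # turn Month and Location back into columns
           .rename(columns={'Product_x': 'Product', 'Product_y': 'Destination'}))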
    
    print(out)
    
       Month  Location Product Destination
    0     17  Banglore       A           A
    1     17  Banglore       A           B
    2     17  Banglore       B           A
    3     17  Banglore       B           B
    4     18  Banglore       C           C
    5     18       GOA       D           D
    6     18       GOA       D           B
    7     18       GOA       B           D
    8     18       GOA       B           B
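
For a self-contained check, here is a minimal sketch of an alternative that avoids groupby + apply: merging the frame with itself on Month and Location pairs every product with every product in the same group. The input frame below is an assumption, since the original data set was only posted as an image; its rows are inferred from the printed output above. The name `pairs` is just illustrative.

```python
import pandas as pd

# Assumed sample input, inferred from the printed output above
# (the original data set was only available as an image).
df = pd.DataFrame({
    'Month': [17, 17, 18, 18, 18],
    'Location': ['Banglore', 'Banglore', 'Banglore', 'GOA', 'GOA'],
    'Product': ['A', 'B', 'C', 'D', 'B'],
})

# Self-merge on the group keys: each row is matched with every row that
# shares its Month and Location, which yields the per-group Cartesian product.
# The overlapping Product column gets the default _x / _y suffixes.
pairs = (df.merge(df, on=['Month', 'Location'])
           .rename(columns={'Product_x': 'Product', 'Product_y': 'Destination'}))

print(pairs)
```

On this sample it should give the same nine Product/Destination pairs as the groupby version. Note that if a product repeats within a group, both approaches will repeat the corresponding pairs, so you may want to call drop_duplicates() on the input first.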