python pandas group-by

pandas groupby().transform() with custom condition


Suppose I have the following table:


import pandas as pd

data = pd.DataFrame({
'Group':['A','A','A','A','B','B']
, 'Month':[1,2,3,4,1,2]
, 'Value':[100,300,700,750, 200,400]
})

I would like to use groupby and transform in pandas to create a new column that holds, for every row, the Value its group has in month 2.

Here's how the result should look:

import pandas as pd

data = pd.DataFrame({
'Group':['A','A','A','A','B','B']
, 'Month':[1,2,3,4,1,2]
, 'Value':[100,300,700,750, 200,400]
, 'Desired_Result':[300,300,300,300,400,400]
})

It seems like there should be a straightforward way of accomplishing this with groupby and transform, but I haven't found it yet.


Solution

  • Filter the Month 2 rows with boolean indexing, index them by Group, and use Series.map to look up each row's group value:

    s = data[data['Month'].eq(2)].set_index('Group')['Value']
    data['Desired_Result'] = data['Group'].map(s)
    
    print(data)
      Group  Month  Value  Desired_Result
    0     A      1    100             300
    1     A      2    300             300
    2     A      3    700             300
    3     A      4    750             300
    4     B      1    200             400
    5     B      2    400             400
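
The map approach relies on the lookup Series having a unique index, and it leaves NaN for any group that has no Month 2 row. The sketch below is only an illustration of both cases: the extra group C and the name `lookup` are assumptions added here, not part of the original data.

```python
import pandas as pd

# Same data as in the question, plus a hypothetical group 'C' with no Month 2 row
data = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'B', 'B', 'C'],
    'Month': [1, 2, 3, 4, 1, 2, 1],
    'Value': [100, 300, 700, 750, 200, 400, 50],
})

# Build the lookup: one Value per Group, taken from the Month == 2 rows.
# drop_duplicates keeps the first Month 2 row per group so the index is unique.
lookup = (
    data[data['Month'].eq(2)]
    .drop_duplicates(subset='Group')
    .set_index('Group')['Value']
)

# Groups missing from the lookup (here 'C') get NaN, so the column becomes float
data['Desired_Result'] = data['Group'].map(lookup)
print(data)
```

Whether keeping the first Month 2 row is the right rule for duplicates depends on the data; aggregating those rows instead (for example with a mean) would be an equally valid choice.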
    

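    With GroupBy.transform, it is possible to replace the non-matching values with NaN and then take 'first', which returns the first non-NaN value in each group. Because of the intermediate NaNs, the new column is float: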

    s = data['Value'].where(data['Month'].eq(2))
    data['Desired_Result'] = s.groupby(data['Group']).transform('first')
    
    print(data)
      Group  Month  Value  Desired_Result
    0     A      1    100           300.0
    1     A      2    300           300.0
    2     A      3    700           300.0
    3     A      4    750           300.0
    4     B      1    200           400.0
    5     B      2    400           400.0
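
If the float output of the transform variant is unwanted, one option is to cast the result to pandas' nullable integer dtype. This is a minimal sketch, assuming the values are whole numbers; the variable names `masked` and `per_group` are chosen here for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Month': [1, 2, 3, 4, 1, 2],
    'Value': [100, 300, 700, 750, 200, 400],
})

# Keep Value only where Month == 2; every other row becomes NaN
masked = data['Value'].where(data['Month'].eq(2))

# 'first' broadcasts the first non-NaN value of each group to all of its rows
per_group = masked.groupby(data['Group']).transform('first')

# Nullable 'Int64' restores integers and would keep <NA> for groups without a Month 2 row
data['Desired_Result'] = per_group.astype('Int64')
print(data)
```

A plain `astype(int)` would also work for this data, but it fails as soon as one group has no Month 2 row, which is why the nullable dtype is used in the sketch.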