[SOLVED] pandas groupby().transform() with custom condition

Suppose I have the following table:


import pandas as pd

data = pd.DataFrame({
'Group':['A','A','A','A','B','B']
, 'Month':[1,2,3,4,1,2]
, 'Value':[100,300,700,750, 200,400]
})

I would like to use groupby and transform in pandas to create a new column that holds, for every row, the Value its group has in month 2.

Here's how the result should look:

import pandas as pd

data = pd.DataFrame({
'Group':['A','A','A','A','B','B']
, 'Month':[1,2,3,4,1,2]
, 'Value':[100,300,700,750, 200,400]
, 'Desired_Result':[300,300,300,300,400,400]
})

It seems like there should be a straightforward way to accomplish this with groupby and transform, but I haven't found it yet.

Solution

Use Series.map with a lookup Series built from the month 2 rows, selected by boolean indexing and indexed by Group:

s = data[data['Month'].eq(2)].set_index('Group')['Value']
data['Desired_Result'] = data['Group'].map(s)

print (data)
  Group  Month  Value  Desired_Result
0     A      1    100             300
1     A      2    300             300
2     A      3    700             300
3     A      4    750             300
4     B      1    200             400
5     B      2    400             400
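
One thing to be aware of with the map approach is what happens when a group has no month 2 row. The sketch below is only an illustration: the extra group 'C' is made up and is not part of the question's data. Groups missing from the lookup Series come back as NaN, which also turns the new column into floats. If a group had more than one month 2 row, the lookup index would no longer be unique and would probably need deduplicating before calling map.

```python
import pandas as pd

# Hypothetical data: group 'C' has no row for month 2
data = pd.DataFrame({
    'Group': ['A', 'A', 'B', 'B', 'C'],
    'Month': [1, 2, 1, 2, 1],
    'Value': [100, 300, 200, 400, 500],
})

# Lookup Series: the month 2 Value of each group, indexed by Group
s = data[data['Month'].eq(2)].set_index('Group')['Value']

# Groups absent from the lookup (here 'C') are mapped to NaN
data['Desired_Result'] = data['Group'].map(s)

print(data)
```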

With GroupBy.transform, it is possible to replace the non-matching values with NaN and then take the first non-missing value of each group with 'first':

s = data['Value'].where(data['Month'].eq(2))
data['Desired_Result'] = s.groupby(data['Group']).transform('first')

print (data)
  Group  Month  Value  Desired_Result
0     A      1    100           300.0
1     A      2    300           300.0
2     A      3    700           300.0
3     A      4    750           300.0
4     B      1    200           400.0
5     B      2    400           400.0
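
The result above is float because where() puts NaN into the intermediate Series. If integers are wanted, one option (a sketch, not the only way) is to cast to the nullable 'Int64' dtype afterwards, which would also tolerate groups that have no month 2 row:

```python
import pandas as pd

data = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Month': [1, 2, 3, 4, 1, 2],
    'Value': [100, 300, 700, 750, 200, 400],
})

# Keep Value only where Month == 2; every other row becomes NaN
s = data['Value'].where(data['Month'].eq(2))

# 'first' takes the first non-missing value per group and broadcasts it to all rows
result = s.groupby(data['Group']).transform('first')

# Cast from float64 to the nullable integer dtype
data['Desired_Result'] = result.astype('Int64')

print(data)
```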