python, group-by, aggregate

Result based on another column using pandas aggregation


I'm looking for a way, within pandas agg, to return the value of one column based on the value of another column.

For example, I have the following DataFrame:

import pandas as pd

df = pd.DataFrame({"Project": ['A', 'B', 'C', 'D', 'E'],
                   "Country": ['Brazil', 'Brazil', 'Germany', 'Germany', 'Argentina'],
                   "Value": [12, 11, 14, 15, 18]})

      Country Project  Value
0     Brazil       A     12
1     Brazil       B     11
2    Germany       C     14
3    Germany       D     15
4  Argentina       E     18

I have created this aggregation:

aggregations = {'Project':{'Number of projects':'count'},
                'Value':{'Mean':'mean',
                         'Max':'max',
                         'Min':'min'}}

df.groupby(['Country']).agg(aggregations)

I would like to add new columns to this aggregation that give the name of the project in which the max (and min) of 'Value' was observed. The intended result would look like this:

                     Project Value
          Number of projects  Mean Max Min Project_Max Project_Min
Country
Argentina                  1  18.0  18  18           E           E
Brazil                     2  11.5  12  11           A           B
Germany                    2  14.5  15  14           D           C

How can I implement this in the aggregation dictionary?

Thanks in advance


Solution

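  • Not sure if this is the best way, but it seems to work. Within each group, x.idxmax() (or x.idxmin()) returns the index label of the row holding the largest (or smallest) Value, and that label is then used to look up the Project in the original df: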

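    # x is the 'Value' Series of one group; idxmax()/idxmin() return a row label of df,
    # which is then used to fetch the matching 'Project' from the original frame.
    aggregations = {'Project':{'Number of projects':'count'},
                    'Value':{'Mean':'mean',
                             'Max':'max',
                             'Min':'min',
                             'Project_Max': lambda x: df.loc[x.idxmax(), 'Project'],
                             'Project_Min': lambda x: df.loc[x.idxmin(), 'Project']}}
    df.groupby(['Country']).agg(aggregations)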
    

    Result:

                    Value                                      Project
              Project_Max Project_Min Max  Mean Min Number of projects
    Country                                                           
    Argentina           E           E  18  18.0  18                  1
    Brazil              A           B  12  11.5  11                  2
    Germany             D           C  15  14.5  14                  2
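
One caveat: the nested-dictionary renaming used above was deprecated in pandas 0.20 and raises a SpecificationError from pandas 1.0 onwards. Below is a minimal sketch of the same idea using named aggregation, assuming pandas 0.25 or later. It swaps the lambda lookup for a different trick: Project is made the index first, so the built-in "idxmax" and "idxmin" return the project name directly. The output column names (number_of_projects, value_mean, and so on) are made up for illustration, and the result has flat columns rather than the two-level header shown in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Project": ["A", "B", "C", "D", "E"],
    "Country": ["Brazil", "Brazil", "Germany", "Germany", "Argentina"],
    "Value": [12, 11, 14, 15, 18],
})

# With Project as the index, idxmax/idxmin on 'Value' yield project names.
# Column names on the left are illustrative, not taken from the original post.
result = (
    df.set_index("Project")
      .groupby("Country")["Value"]
      .agg(
          number_of_projects="count",
          value_mean="mean",
          value_max="max",
          value_min="min",
          project_max="idxmax",
          project_min="idxmin",
      )
)

print(result)
```

If two projects in the same country tie for the max or min, idxmax and idxmin report only the first one they encounter, so this sketch is best suited to data where that is acceptable.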