python pandas

Pandas groupBy multiple columns and aggregation


I have a dataframe with 4 columns: col_A, col_B, col_C, col_D. I need to group by col_A, col_B and col_C and compute the mean of col_D. Below is the code snippet I tried, and it worked:

df.groupby(['col_A','col_B','col_C']).agg({'col_D':'mean'}).reset_index()

In addition to the above result, I also need the row count for each (col_A, col_B, col_C) group alongside the mean. Any help on this would be appreciated.


Solution

  • Using Named Aggregation:

    result = (
        df.groupby(['col_A', 'col_B', 'col_C'], as_index=False)
          .agg(mean=('col_D', 'mean'), count=('col_D', 'count'))
    )
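
To make the result shape concrete, here is a minimal sketch of the same named aggregation run on a small made-up dataframe. The sample values are hypothetical; only the column names come from the question. It assumes pandas 0.25 or later, where named aggregation is available.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "col_A": ["x", "x", "x", "y"],
    "col_B": [1, 1, 1, 2],
    "col_C": ["p", "p", "p", "q"],
    "col_D": [10.0, 20.0, None, 5.0],  # one missing value on purpose
})

result = (
    df.groupby(["col_A", "col_B", "col_C"], as_index=False)
      .agg(
          mean=("col_D", "mean"),    # mean of col_D per group, NaN skipped
          count=("col_D", "count"),  # number of non-null col_D values per group
      )
)

print(result)
```

With this data, the group (x, 1, p) gets a mean of 15.0 and a count of 2, because 'count' ignores the missing value in col_D.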
    

    For the count column, you have two choices of aggregate function: