python pandas dataframe

Pandas copy dataframe keeping only max value for rows with same index


If I have a dataframe that looks like

       value otherstuff
    0      4          x
    0      5          x
    0      2          x
    1      2          x
    2      3          x
    2      7          x

what is a succinct way to get a new dataframe that looks like

       value otherstuff
    0      5          x
    1      2          x
    2      7          x

where rows with the same index have been dropped so only the row with the maximum 'value' remains? As far as I am aware there is no option in df.drop_duplicates to keep the max, only the first or last occurrence.


Solution

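  • You can group on the index and take the max. Older answers use df.max(level=0) for this, but the level argument was deprecated in pandas 1.3 and removed in 2.0. Note that the maximum is taken for each column separately, so this only matches the desired output when the other columns are identical within each index label:

    df.groupby(level=0).max()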
    

    Output:

       value otherstuff
    0      5          x
    1      2          x
    2      7          x
    

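    Or, to keep the entire row that holds the maximum 'value', so the other columns stay with it (as raised in the comments; rows tied for the maximum are all kept):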

    df.groupby(level=0,group_keys=False)\
      .apply(lambda x: x.loc[x['value']==x['value'].max()])
    

    Output:

       value otherstuff
    0      5          x
    1      2          x
    2      7          x
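
As a rough, self-contained sketch of how the two approaches differ (assuming a recent pandas; the distinct otherstuff letters and the names colwise, by_value and rowwise are made up for illustration, and the sort-then-deduplicate step is a swapped-in alternative to the apply call above, not part of the original answer):

```python
import pandas as pd

# Same index and 'value' column as in the question, but with distinct
# 'otherstuff' entries (hypothetical) so the two results can be told apart.
df = pd.DataFrame(
    {"value": [4, 5, 2, 2, 3, 7], "otherstuff": list("abcdef")},
    index=[0, 0, 0, 1, 2, 2],
)

# Column-wise maximum per index label: each column is reduced on its own.
colwise = df.groupby(level=0).max()

# Whole-row selection: sort by 'value', then keep the last (largest) row
# for each index label and restore the index order.
by_value = df.sort_values("value", kind="stable")
rowwise = by_value[~by_value.index.duplicated(keep="last")].sort_index()

print(colwise)
print(rowwise)
```

With these made-up letters, the column-wise result pairs value 5 with 'c' for index 0, a combination that never appeared in a single row, while the row-wise result keeps 'b' from the row that actually holds 5. Unlike the apply version, this variant keeps exactly one row per index label even when several rows tie for the maximum.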