python pandas dataframe

Pandas - get first occurrence of a given column value


I have a df with repeated names for different rounds of a tournament, like so:

name   round_id price_open
John       1     5.0
Paul       1     4.0
John       2     5.4
Paul       2     3.4
John       3     5.0
Paul       3     4.0

But at round 3, a new player enters the tournament:

...
George     3     6.0
...

Let's say I need to extract every player's starting price, like so:

df_open = df[df['round_id']==1]['price_open']

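This misses George, whose first round is 3 (he is absent from the result, or shows up as NaN once it is aligned against the full list of players), which is not what I need.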


So how do I filter this df to get the first opening price for every player, ending up with the following?

name  price_open
John   5.0
Paul   4.0
George 6.0 

Solution

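  • Use drop_duplicates to keep the first instance of each name. It keeps the first occurrence by default (keep='first'), so this assumes the rows are already ordered by round_id: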

    >>> df.drop_duplicates('name')
         name  round_id  price_open
    0    John         1         5.0
    1    Paul         1         4.0
    6  George         3         6.0
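
If the rows are not guaranteed to be in round order, a minimal sketch along the following lines may be safer. It rebuilds the sample frame from the question (the row order and index are assumed from the output above), sorts by round_id before dropping duplicates, and then keeps only the two columns the question asks for. The variable names first_rows and alt are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed row order).
df = pd.DataFrame({
    "name": ["John", "Paul", "John", "Paul", "John", "Paul", "George"],
    "round_id": [1, 1, 2, 2, 3, 3, 3],
    "price_open": [5.0, 4.0, 5.4, 3.4, 5.0, 4.0, 6.0],
})

# Sort by round first so that "first occurrence" means "earliest round",
# even if the rows arrive shuffled. mergesort is a stable sort, so rows
# within the same round keep their original relative order.
first_rows = (
    df.sort_values("round_id", kind="mergesort")
      .drop_duplicates("name", keep="first")
)

# Keep only the columns requested in the question.
df_open = first_rows[["name", "price_open"]].reset_index(drop=True)
print(df_open)

# Possible alternative: groupby + first. Note that first() returns the
# first non-null value per group, so it behaves differently from
# drop_duplicates when a player's earliest price_open is NaN.
alt = (
    df.sort_values("round_id", kind="mergesort")
      .groupby("name", sort=False)["price_open"]
      .first()
)
```

Which of the two fits better probably depends on how missing prices should be treated: drop_duplicates keeps the earliest row as-is, while groupby(...).first() skips over missing values.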