The objective is to assign 1 to every row in a group whose index is higher than the one returned by idxmax() for that group.
import numpy as np
import pandas as pd
df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 3, 3, 3], 'val': [1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})
id val
0 1 1.0
1 1 NaN
2 1 0.0
3 2 NaN
4 2 1.0
5 2 0.0
6 3 1.0
7 3 0.0
8 3 0.0
We can use idxmax() to get the index of the highest value in each group:
test = df.groupby('id')['val'].idxmax()
id
1 0
2 4
3 6
The objective is to transform the data to look as follows, so that every value in a group that has a higher index than the one from idxmax() is assigned a 1:
id val
0 1 1.0
1 1 1.0
2 1 1.0
3 2 NaN
4 2 1.0
5 2 1.0
6 3 1.0
7 3 1.0
8 3 1.0
This does not necessarily need to be done with idxmax(). I am open to any suggestions.
If I understand the problem correctly, you can use transform and np.where:
nd = df.groupby('id')['val'].idxmax().tolist()
df['val'] = df.groupby('id')['val'].transform(lambda x: np.where(x.index>nd[x.name-1], 1, x))
df
Output:
id val
0 1 1.0
1 1 1.0
2 1 1.0
3 2 NaN
4 2 1.0
5 2 1.0
6 3 1.0
7 3 1.0
8 3 1.0
Considering the comment, it is probably best to have a dictionary in case the df.id column is not sequential:
nd = df.groupby('id')['val'].idxmax().to_dict()  # maps each id to its group's idxmax label, regardless of row order
df['val'] = df.groupby('id')['val'].transform(lambda x: np.where(x.index>nd[x.name], 1, x))
(The whole thing is significantly slower than the solution offered by not_a_robot.)
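For comparison, here is a minimal vectorized sketch that avoids the per-group lambda. It is not necessarily what the other answer does; it is just one way to broadcast each group's idxmax() label back onto its rows with map and compare against the row index. The names first_max and cutoff are made up for illustration, and the sketch assumes a sortable (here, default integer) index and that no group is entirely NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'val': [1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

# Index label of the maximum 'val' within each group, keyed by id.
# Assumption: every group has at least one non-NaN value.
first_max = df.groupby('id')['val'].idxmax()

# Broadcast each group's idxmax label back onto the rows of that group.
cutoff = df['id'].map(first_max)

# Rows whose index is greater than their group's idxmax label get 1;
# all other rows (including the NaN before the maximum) are left alone.
after_max = df.index.to_numpy() > cutoff.to_numpy()
df.loc[after_max, 'val'] = 1

print(df)
```

Because the lookup goes through the id values themselves, this does not depend on the ids being sequential or starting at 1.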