I have a data frame like the one below.
import pandas as pd
data = {'Date': ['2022/09/01', '2022/09/02', '2022/09/03', '2022/09/04', '2022/09/05','2022/09/01', '2022/09/02', '2022/09/03', '2022/09/04', '2022/09/05','2022/09/01', '2022/09/02', '2022/09/03', '2022/09/04', '2022/09/05'],
'Runner': ['Runner A', 'Runner A', 'Runner A', 'Runner A', 'Runner A','Runner B', 'Runner B', 'Runner B', 'Runner B', 'Runner B','Runner C', 'Runner C', 'Runner C', 'Runner C', 'Runner C'],
'Training Time': ['less than 1 hour', 'less than 1 hour', 'less than 1 hour', 'less than 1 hour', '1 hour to 2 hour','less than 1 hour', '1 hour to 2 hour', 'less than 1 hour', '1 hour to 2 hour', '2 hour to 3 hour', '1 hour to 2 hour ', '2 hour to 3 hour' ,'1 hour to 2 hour ', '2 hour to 3 hour', '2 hour to 3 hour']
}
df = pd.DataFrame(data)
I have counted the occurrence for each runner using the below code
s = df.groupby(['Runner','Training Time']).size()
s
The result is like below.
And when I use below code to get the max occurence.
df = s.loc[s.groupby(level=0).idxmax()].reset_index().drop(0,axis=1)
df
Result
The problem is on Runner B. It should show "1 hour to 2 hour" and "less than 1 hour". But now it only shows ""1 hour to 2 hour"
How can I fix the issue? Thanks. Expected Result
thsi is the best i could come up with ... its kind of gross
def agg_most_common(vals):
print("vals")
matches = []
for i in collections.Counter(vals).most_common():
if not matches or matches[0][1] == i[1]:
matches.append(i)
else:
break
return [x[0] for x in matches]
print(df.groupby('Runner')['Training Time'].agg(agg_most_common))