I have a pandas DataFrame in the following format:
Class Category
XYZ ABC
XYZ ABC
XYZ DEF
XYZ1 ABC
XYZ1 ABC
XYZ1 ABC
XYZ1 HLR
XYZ2 ABC
For every unique class, if there are multiple observations for that class, I would like to assign the corresponding category to that class based on "majority voting".
For example, for "XYZ", Category should be "ABC".
For "XYZ1", category has to be "ABC" as well, because "HLR" appears only once.
If there are no discrepancies, then it's straightforward (for "XYZ2", it would be "ABC").
Is there a way to achieve this without storing the value counts in a table and then looping over it to assign categories by majority vote?
Any leads would be appreciated.
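Use mode from the statistics module with groupby and transform:
from statistics import mode
# Select the Category column so transform returns a Series aligned with df
df['New_Category'] = df.groupby('Class')['Category'].transform(mode)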
Class Category New_Category
0 XYZ ABC ABC
1 XYZ ABC ABC
2 XYZ DEF ABC
3 XYZ1 ABC ABC
4 XYZ1 ABC ABC
5 XYZ1 ABC ABC
6 XYZ1 HLR ABC
7 XYZ2 ABC ABC
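One thing to watch is ties. On Python 3.8 and later, statistics.mode returns the first of the tied values it encounters; on earlier versions it raises StatisticsError. If you would rather stay within pandas, the following is a minimal sketch of the same idea using Series.mode. The helper name majority and the tie-breaking rule (take the first value returned by Series.mode, which is sorted) are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data reproduced from the question
df = pd.DataFrame({
    "Class": ["XYZ", "XYZ", "XYZ", "XYZ1", "XYZ1", "XYZ1", "XYZ1", "XYZ2"],
    "Category": ["ABC", "ABC", "DEF", "ABC", "ABC", "ABC", "HLR", "ABC"],
})

def majority(s: pd.Series) -> str:
    # Series.mode() returns every most-frequent value, sorted;
    # picking the first one is an assumed tie-breaking rule.
    return s.mode().iloc[0]

# transform broadcasts the per-group result back to every row of the group
df["New_Category"] = df.groupby("Class")["Category"].transform(majority)
print(df)
```

If you need a different tie-breaking rule (for example, keeping the value that appears first in the data), replace the body of majority accordingly.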