pandas voting

Find majority elements in a dataframe (PANDAS)


I need to build a majority vote (at least 3 out of 5) over the int64 values in the columns below and store the result in a new column (Voting).

       Column1  Column2  Column3  Column4  Column5
    0        0        0        6        1        0
    1        4        4        6        4        0
    2        4        2        2        2        2
    3        4        4        4        4        4
    4        0        0        0        2        4
    5        6        6        6        6        6
    6        3        3        3        3        5
    7        0        6        6        0        4
    8        3        3        3        3        4
    9        2        2        4        2        2

My expected result looks like this:

       Column1  Column2  Column3  Column4  Column5  Voting
    0        0        0        6        1        0       0
    1        4        4        6        4        0       4
    2        4        2        2        2        2       2
    3        4        4        4        4        4       4
    4        0        0        0        2        4       0
    5        6        6        6        6        6       6
    6        3        3        3        3        5       3
    7        0        6        6        0        4      -1
    8        3        3        3        3        4       3
    9        2        2        4        3        3      -1

where -1 is assigned when two values are tied for the most occurrences in a row (for example, two pairs).

Thanks a lot. 

Solution

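  • You can use DataFrame.mode(axis=1) together with np.where(). When a row has several modes, mode() returns them in extra columns, so a non-null value in column 1 signals a tie:

    import numpy as np

    # Compute the row-wise modes once. Column 0 holds the first mode;
    # column 1 is non-null only when a second value ties with it
    # (it exists only if at least one row in the frame has a tie).
    modes = df.mode(axis=1)
    df['Voting'] = np.where(modes[1].notnull(), -1, modes[0]).astype(int)
    print(df)
    '''
       Column1  Column2  Column3  Column4  Column5  Voting
    0        0        0        6        1        0       0
    1        4        4        6        4        0       4
    2        4        2        2        2        2       2
    3        4        4        4        4        4       4
    4        0        0        0        2        4       0
    5        6        6        6        6        6       6
    6        3        3        3        3        5       3
    7        0        6        6        0        4      -1
    8        3        3        3        3        4       3
    9        2        2        4        2        2       2
    '''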
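
The mode-based approach above detects ties, but it does not check the 3-of-5 threshold on its own: a row such as 0 0 1 2 3 has a single mode (0) that occurs only twice. If a strict majority is needed, one possible sketch is to count occurrences per row. The helper name `majority_vote`, the `threshold` parameter, and the `vote_cols` list are made up for this illustration and are not part of pandas; the frame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column1": [0, 4, 4, 4, 0, 6, 3, 0, 3, 2],
    "Column2": [0, 4, 2, 4, 0, 6, 3, 6, 3, 2],
    "Column3": [6, 6, 2, 4, 0, 6, 3, 6, 3, 4],
    "Column4": [1, 4, 2, 4, 2, 6, 3, 0, 3, 2],
    "Column5": [0, 0, 2, 4, 4, 6, 5, 4, 4, 2],
})

# Hypothetical helper: return the most frequent value in a row if it
# occurs at least `threshold` times, otherwise -1.
def majority_vote(row, threshold=3):
    counts = row.value_counts()  # sorted by count, most frequent first
    top_value, top_count = counts.index[0], counts.iloc[0]
    return int(top_value) if top_count >= threshold else -1

# Restrict the vote to the input columns so a rerun ignores 'Voting'.
vote_cols = ["Column1", "Column2", "Column3", "Column4", "Column5"]
df["Voting"] = df[vote_cols].apply(majority_vote, axis=1)
print(df)
```

With five columns and a threshold of 3, at most one value can reach the threshold, so no separate tie handling is needed. Row-wise `apply` is slower than the vectorized `mode` call, which may matter on large frames.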