python pandas numpy

Get a single series of classes instead of one series for each class with pandas in Python


I have a DataFrame with three columns of zeroes and ones, corresponding to three different classes. I want a single series of zeroes, ones, and twos that depends on the class of each entry (0 for the first class, 1 for the second, and 2 for the third):

>>> results.head()
    HOME_WINS  DRAW  AWAY_WINS
ID                            
0           0     0          1
1           0     1          0
2           0     0          1
3           1     0          0
4           0     1          0

What I want:

>>> results.head()
    SCORE
ID                            
0       2
1       1
2       2
3       0
4       1

Solution

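  • Multiply each column by its class code from a dictionary, sum across the columns, and convert the result with to_frame (df here stands for the question's results DataFrame):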

    d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}
    
    out = df.mul(d).sum(axis=1).to_frame(name='SCORE')
    

    Or using a dot product:

    d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}
    
    out = df.dot(pd.Series(d)).to_frame(name='SCORE')
    

    Or, if there is exactly one 1 per row, with from_dummies (available in pandas 1.5 and later):

    d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}
    
    out = pd.from_dummies(df)[''].map(d).to_frame(name='SCORE')
    

    Output:

        SCORE
    ID       
    0       2
    1       1
    2       2
    3       0
    4       1
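
For anyone who wants to check the approaches above end to end, here is a minimal, self-contained sketch. It rebuilds the five sample rows from the question (the name `results` follows the question; the construction code itself is an assumption, since the original data source is not shown) and compares the outputs. The `from_dummies` variant is guarded with `hasattr` because that function only exists in pandas 1.5 and later.

```python
import pandas as pd

# Sample rows copied from the question; how the real data is loaded is not shown there.
results = pd.DataFrame(
    {
        "HOME_WINS": [0, 0, 0, 1, 0],
        "DRAW": [0, 1, 0, 0, 1],
        "AWAY_WINS": [1, 0, 1, 0, 0],
    },
    index=pd.Index(range(5), name="ID"),
)

# Class code for each column.
d = {"HOME_WINS": 0, "DRAW": 1, "AWAY_WINS": 2}

# 1) Multiply each column by its code, then sum across the columns.
by_mul = results.mul(d).sum(axis=1).to_frame(name="SCORE")

# 2) Same arithmetic written as a dot product with a Series of codes.
by_dot = results.dot(pd.Series(d)).to_frame(name="SCORE")

# 3) Decode the one-hot columns back to labels, then map labels to codes.
#    Requires exactly one 1 per row and pandas >= 1.5.
by_dummies = None
if hasattr(pd, "from_dummies"):
    by_dummies = pd.from_dummies(results)[""].map(d).to_frame(name="SCORE")

print(by_mul)
print(by_mul["SCORE"].tolist() == by_dot["SCORE"].tolist())
```

The first two variants rely on each row being one-hot: a row with several 1s would silently produce the sum of their codes, and an all-zero row would produce 0 (the same as HOME_WINS), so validating the input first may be worthwhile if that is a possibility.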