python, pandas, categorical-data

Recoding categorical variables in pandas, different mapping for each column


I have a dataframe of categorical data that I would like to recode. Each column has its own dictionary of recoded values.

Below is a toy example of the code I have so far:

import pandas as pd

ser = pd.DataFrame({'a':[1,3,3,1], 'b':[2,2,4,5]})
print(ser)

a_dict = {1:11, 3:33}
b_dict = {2:22, 4:44, 5:55}
ser.a = ser.a.map(a_dict)
ser.b = ser.b.map(b_dict)
print(ser)

(Of course, my real data has many more than two columns.) Is there a more concise way of mapping (applying) over the entire dataframe?


Solution

  • DataFrame.replace can take a nested dictionary: the outer keys are column names, and each value is the dictionary of replacements to use in that column. By default it returns a new dataframe rather than modifying ser, so assign the result back if you want to keep it.

    ser.replace(dict(a=a_dict, b=b_dict))
    
        a   b
    0  11  22
    1  33  22
    2  33  44
    3  11  55
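
One difference from the map approach in the question is worth checking on your own data: replace leaves values that have no key in the dictionary unchanged, while map turns them into NaN. The sketch below illustrates this under a few assumptions that are not in the original post: the frame is called df, the name mappings is a hypothetical container holding one dictionary per column, and the dictionary for b deliberately omits the value 5.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 3, 1], 'b': [2, 2, 4, 5]})

# Hypothetical container: column name -> recoding dictionary.
# The mapping for 'b' deliberately has no entry for the value 5.
mappings = {
    'a': {1: 11, 3: 33},
    'b': {2: 22, 4: 44},
}

# replace: values without a key in the mapping are left as they are.
replaced = df.replace(mappings)

# map, applied column by column: values without a key become NaN.
mapped = df.copy()
for col, mapping in mappings.items():
    mapped[col] = mapped[col].map(mapping)

print(replaced)
print(mapped)
```

Here the last row of b should stay 5 in replaced and become NaN in mapped (which also makes that column float). If every value is covered by its dictionary, as in the question, the two approaches give the same values.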