Consider the dataframe df
:
import pandas as pd
df = pd.DataFrame({'Event1': ['Music', 'Something else 1', 'Theatre', 'Comedy'],
'Event2': ['Something else 2', 'Ballet', 'Something else 3', 'Something else 4'],
'Cost': [10000, 5000, 15000, 2000]})
print(df)
Event1 Event2 Cost
0 Music Something else 2 10000
1 Something else 1 Ballet 5000
2 Theatre Something else 3 15000
3 Comedy 4 2000
I would like to map the values of the Event1
and Event2
to the values in the respective dictionaries:
# Mapping tables
dict1 = {'Music': 'M', 'Cooking': 'C', 'Theatre': 'T', 'Comedy': 'C'}
dict2 = {'Ballet': 'B', 'Swimming': 'S'}
And store the mappings in a common column because I know that per row, only the value of one column will be mapped. The end result would be:
# desired outcome
result = pd.DataFrame({'Event1': ['Music', 'Something else 1', 'Theatre', 'Comedy'],
'Event2': ['Something else 2', 'Ballet', 'Something else 3', '4'],
'Event': ['M', 'B', 'T', 'C'],
'Cost': [10000, 5000, 15000, 2000]})
print(result)
Event1 Event2 Event Cost
0 Music Something else 2 M 10000
1 Something else 1 Ballet B 5000
2 Theatre Something else 3 T 15000
3 Comedy 4 C 2000
I can only do this in a messy and lengthy way and was hoping there is clean maybe idiomatic way of doing this.
How would you advise doing it?
You can combine map
and fillna
:
df['Event'] = df['Event1'].map(dict1).fillna(df['Event2'].map(dict2))
Output:
Event1 Event2 Cost Event
0 Music Something else 2 10000 M
1 Something else 1 Ballet 5000 B
2 Theatre Something else 3 15000 T
3 Comedy Something else 4 2000 C
If you imagine a more complex input with an arbitrary number of columns, you could generalize with a dictionary to pair the columns/dictionaries, then:
dict3 = {'Comedy': 'C'}
dicts = {'Event1': dict1, 'Event2': dict2, 'Event3': dict3}
df['Event'] = (df[list(dicts)].apply(lambda s: s.map(dicts[s.name]))
.bfill(axis=1).iloc[:, 0]
)
Output:
Event1 Event2 Event3 Cost Event
0 Music Something else 2 Something else 10000 M
1 Something else 1 Ballet Something else 5000 B
2 Theatre Something else 3 Something else 15000 T
3 Something else Something else 4 Comedy 2000 C
The above approaches suppose that the mappings are per column, if you want global mappings, you can merge the dictionaries and use:
from collections import ChainMap
df['Event'] = (df.filter(like='Event').stack()
.map(dict(ChainMap(dict1, dict2, dict3)))
.groupby(level=0).first()
)
# or
from collections import ChainMap
df['Event'] = (df.filter(like='Event')
.map(dict(ChainMap(dict1, dict2, dict3)).get)
.bfill(axis=1).iloc[:, 0]
)