I have two DataFrames:
df1
PRA_Kod;WorkerId
1049;9024
0014;60260
0608;10506
20270/9;20270
9511;64473
0639;60264
0767;10509
.. .. .. . .
df2
WorkerId;Day;Time;W
1049;11;u.wyp;WE
1049;14;u.wyp;WE
64454;11;u.wyp;WE
0014;15;u.wyp;WE
64471;12;u.wyp;WE
64471;13;u.wyp;WE
0639;06;u.wyp;WE
0639;01;u.wyp;WE
0639;02;u.wyp;WE
0639;03;u.wyp;WE
64465;04;u.wyp;WE
64465;05;u.wyp;WE
... ... .. .
I need to change the value in the column df2['WorkerId']
to the value df1['WorkerId']
if df2['WorkerId']
appears in df1['PRA_Kod']
.
A note: PRA_Kod
value "20270/9" is no typo. These tables result from combining data from different tables. Some users happen to have different indexes accross different tables.
Desired output:
df_result
WorkerId;Day;Time;W
9024;11;u.wyp;WE
9024;14;u.wyp;WE
64454;11;u.wyp;WE
60260;15;u.wyp;WE
64471;12;u.wyp;WE
64471;13;u.wyp;WE
60264;06;u.wyp;WE
60264;01;u.wyp;WE
60264;02;u.wyp;WE
60264;03;u.wyp;WE
64465;04;u.wyp;WE
64465;05;u.wyp;WE
... ... .. .
How can this be done?
My attempt below results raises ValueError: Can only compare identically-labeled Series objects.
df_result.loc[ df2['WorkerId'].eq == df1['PRA_Kod'].eq,
df2['WorkerId'] ] = df1['WorkerId']
The main issue with your line, is that
==
, is requesting a screening for a row-by-row match of both columns in df1 and df2.Things will go much smoother when you first adapt df1 for use as dictionary.
Method:
1. Dictionary
It's not directly df1.to_dict()
, see? Either we write a loop, or we transpose it and rework columns and index, so that key:value pairs do land as needed:
map_dict = df1.set_index('PRA_Kod').T.to_dict('records')[0]
{'1049': '9024',
'0014': '60260',
'0608': '10506',
'20270/9': '20270',
'9511': '64473',
'0639': '60264',
'0767': '10509'}
2. Map
The mapping finds cells that are also dictionary keys, and replaces those with dictionary values. Other cells are kept as they were.
df_result = df2.assign(WorkerId = df2.WorkerId.map(map_dict).fillna(df2.WorkerId))
WorkerId Day Time W
0 9024 11 u.wyp WE
1 9024 14 u.wyp WE
2 64454 11 u.wyp WE
3 60260 15 u.wyp WE
4 64471 12 u.wyp WE
5 64471 13 u.wyp WE
6 60264 06 u.wyp WE
7 60264 01 u.wyp WE
8 60264 02 u.wyp WE
9 60264 03 u.wyp WE
10 64465 04 u.wyp WE
11 64465 05 u.wyp WE
Refer to each linked question for more details about how both steps work. Without finding this method for step 2, I might have gone for an np.where
, testing for value .isin
dictionary keys.