After mapping by column c, If column A has a value, insert the value of column A; if not, insert column B.
data1 data2
a b c a c d
a1 b1 c1 1a c1 1d
b2 c2 2a c2 2d
a3 c3 3a c3 3d
4a c4 4d
The result I want
result
a b c
a1 b1 c1
2a b2 c2
a3 c3
I tried the following, but I was not satisfied.
->>> result = data1.merge(data2, on=['c'])
Prefixes _x and _y are created. combine_first is not applied.
->>> result = data1.combine_first(data2)
It is not mapped by column c.
How do I get good results? I ask for your help. thank you
Using @IdoS setup:
import pandas as pd
data1 = pd.DataFrame({'a': ['a1', None, 'a3'],
'b': ['b1', 'b2', None],
'c': ['c1', 'c2', 'c3']})
data2 = pd.DataFrame({'a': ['1a', '2a', '3a', '4a'],
'c': ['c1', 'c2', 'c3', 'c4'],
'd': ['1d', '2d', '3d', '4d']})
You can use set_index
, combine_first
, and reindex:
df_out = data1.set_index('c').combine_first(data2.set_index('c'))\
.reindex(data1.c)\
.reset_index()
df_out
Output:
c a b d
0 c1 a1 b1 1d
1 c2 2a b2 2d
2 c3 a3 None 3d