I have two DataFrames, both of which have an x
column and a y
column. The x
values are unique within each DataFrame, but there is some overlap between the two.
I want to merge the two DataFrames, keeping only the x
values that appear in both DataFrames and copying the y
values from each DataFrame into its own column (y_1
or y_2
, depending on the source).
df1 = pd.DataFrame({'x': ['1', '2', '3', '4', '5', '6'],
'y': ['A', 'B', 'C', 'D', 'E', 'F']})
df2 = pd.DataFrame({'x': ['3', '4', '10', '2', '12', '6', '8', '9'],
'y': ['A', 'G', 'H', 'E', 'F', 'J', 'L', 'B']})
x y
0 1 A
1 2 B
2 3 C
3 4 D
4 5 E
5 6 F
x y
0 3 A
1 4 G
2 10 H
3 2 E
4 12 F
5 6 J
6 8 L
7 9 B
The result I want is:
x y_1 y_2
0 3 A C
1 4 G D
2 2 E B
3 6 J F
You can do it with pd.merge
pd.merge(df2, df1, on='x', how='inner', suffixes=('_1', '_2'))