
pandas: merge (join) two data frames on multiple columns


I am trying to join two pandas dataframes using two columns:

new_df = pd.merge(A_df, B_df,  how='left', left_on='[A_c1,c2]', right_on = '[B_c1,c2]')

but got the following error:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4164)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4028)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13166)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13120)()

KeyError: '[B_1, c2]'

What is the right way to do this?


Solution

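  • Pass left_on and right_on as lists of column names, not as a single string with brackets inside it. A string such as '[A_c1,c2]' is treated as one column label, which is why pandas raises a KeyError. Try this: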

    new_df = pd.merge(
        left=A_df, 
        right=B_df,
        how='left',
        left_on=['A_c1', 'c2'],
        right_on=['B_c1', 'c2'],
    )
    

    https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html

    left_on : label or list, or array-like
        Field names to join on in left DataFrame. Can be a vector or list of vectors of the length of the DataFrame to use a particular vector as the join key instead of columns.

    right_on : label or list, or array-like
        Field names to join on in right DataFrame or vector/list of vectors per left_on docs.
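
To check this end to end, here is a minimal sketch with made-up data. The column names A_c1, B_c1 and c2 come from the question; the values and the a_val / b_val columns are assumptions added purely for illustration. It shows the string form failing with a KeyError and the list form performing the two-column left join.

```python
import pandas as pd

# Hypothetical toy frames; only the key column names come from the question.
A_df = pd.DataFrame({
    "A_c1": [1, 2, 3],
    "c2": ["x", "y", "z"],
    "a_val": [10, 20, 30],      # illustrative payload column
})
B_df = pd.DataFrame({
    "B_c1": [1, 2, 4],
    "c2": ["x", "y", "z"],
    "b_val": [100, 200, 400],   # illustrative payload column
})

# The string form is looked up as a single column label and fails.
string_form_failed = False
try:
    pd.merge(A_df, B_df, how="left",
             left_on="[A_c1,c2]", right_on="[B_c1,c2]")
except KeyError as exc:
    string_form_failed = True
    print("KeyError:", exc)

# The list form joins on both columns pairwise:
# A_c1 is matched with B_c1, and c2 with c2.
new_df = pd.merge(
    A_df,
    B_df,
    how="left",
    left_on=["A_c1", "c2"],
    right_on=["B_c1", "c2"],
)
print(new_df)
```

Because this is a left join, every row of A_df is kept. The row with A_c1 == 3 has no partner in B_df (where the matching c2 value sits next to B_c1 == 4), so its b_val comes back as NaN.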