pythonpandasmergegroupdataspell

Merge two Dataframe without duplicate rows pandas


We have problem in merge two different Pandas data frame, using the merge method it duplicate the rows with the same ID

We have two different DF in Pandas:

DF1:

   ID  Gender
0   1    Male
1   2  Female
2   3  Female

DF2

   ID  Vote
0   1    30
1   2    27
2   2    22

We want this result as output:

   ID  Gender  Vote 1  Vote 2
0   1    Male      30    <NA>
1   2  Female      27      22
2   3  Female    <NA>    <NA>

Solution

  • You have to reshape your second dataframe before merging to the first one:

    df2a = df2.pivot_table(index='ID', columns=df2.groupby('ID').cumcount().add(1), 
                           values='Vote').add_prefix('Vote ')
    
    out = df1.merge(df2a, on='ID', how='left')
    print(out)
    
    # Output
       ID  Gender  Vote 1  Vote 2
    0   1    Male    30.0     NaN
    1   2  Female    27.0    22.0
    2   3  Female     NaN     NaN