Tags: python, pandas, dataframe

Pandas - Interleave / Zip two DataFrames by row


Suppose I have two dataframes:

>> df1

   0  1  2
0  a  b  c
1  d  e  f

>> df2

   0  1  2
0  A  B  C
1  D  E  F

How can I interleave the rows? i.e. get this:

>> interleaved_df

   0  1  2
0  a  b  c
1  A  B  C
2  d  e  f
3  D  E  F

(Note my real DFs have identical columns, but not the same number of rows).


What I've tried

Inspired by this question (very similar, but it asks about columns):

import pandas as pd
from itertools import chain, zip_longest

df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])  
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])

concat_df = pd.concat([df1,df2])

new_index = chain.from_iterable(zip_longest(df1.index, df2.index))
# new_index now holds the interleaved row indices

interleaved_df = concat_df.reindex(new_index)

ValueError: cannot reindex from a duplicate axis

The last call fails because df1 and df2 have some identical index values (which is also the case with my real DFs).
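
To make the failure concrete, here is a minimal sketch using the same two toy frames. It only inspects the concatenated index and tries a reindex against it; the variable names `index_labels`, `has_unique_index` and `reindex_failed` are mine, added for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']])

concat_df = pd.concat([df1, df2])

# Both frames contribute the labels 0 and 1, so each label appears twice.
index_labels = concat_df.index.tolist()
has_unique_index = concat_df.index.is_unique

# reindex cannot tell which of the duplicated rows a label refers to,
# so it raises ValueError instead of guessing.
try:
    concat_df.reindex([0, 0, 1, 1])
    reindex_failed = False
except ValueError:
    reindex_failed = True
```

So the problem is not the interleaved label order itself, but that the labels no longer identify a single row after `pd.concat`.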

Any ideas?


Solution

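  • You can sort the index after concatenating and then reset it. Use a stable sort (kind='mergesort') so that, for equal index values, the df1 row is guaranteed to come before the df2 row:

    import pandas as pd
    
    df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])
    df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])
    
    # The default sort kind ('quicksort') does not guarantee the order of ties.
    concat_df = pd.concat([df1,df2]).sort_index(kind='mergesort').reset_index(drop=True)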
    

    Output:

       0  1  2
    0  a  b  c
    1  A  B  C
    2  d  e  f
    3  D  E  F
    

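    EDIT (OmerB): To keep the original row order regardless of the index values, reset the index first and restore it afterwards:

    import pandas as pd
    df1 = pd.DataFrame([['a','b','c'], ['d','e','f']]).reset_index()
    df2 = pd.DataFrame([['A','B','C'], ['D','E','F']]).reset_index()
    
    # Stable sort on the fresh positional index; set_index('index') restores the original labels.
    concat_df = pd.concat([df1,df2]).sort_index(kind='mergesort').set_index('index')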
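
Since the question mentions frames with different numbers of rows, here is a minimal sketch of the same idea written as a reusable function. It interleaves purely by row position and discards the original labels. The name `interleave_rows` and its parameters are hypothetical, not part of pandas, and the use of NumPy's stable `argsort` is my own substitution for `sort_index`.

```python
import numpy as np
import pandas as pd

def interleave_rows(first, second):
    """Hypothetical helper: alternate the rows of two frames by position."""
    # Position of each row within its own frame: 0..n-1 followed by 0..m-1.
    positions = np.concatenate([np.arange(len(first)), np.arange(len(second))])
    combined = pd.concat([first, second], ignore_index=True)
    # A stable argsort keeps `first` ahead of `second` whenever positions tie.
    order = np.argsort(positions, kind='stable')
    return combined.iloc[order].reset_index(drop=True)

# Frames of unequal length: three rows versus one row.
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']])
df2 = pd.DataFrame([['A', 'B', 'C']])

interleaved_df = interleave_rows(df1, df2)
```

Once the shorter frame runs out, the remaining rows of the longer frame simply follow in their original order.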