pythonpandasmergetime-seriesouter-join

Outer merge two dataframe in python where the left dataframe is larger than the first


I am trying to perform an outer join (union) of two time series in python. The left time series is larger than the first.

An example:

Right time series - df_1

Time Series 1
3 1
4 2
5 3

Left time series - df_2

Time Series 2
0 10
1 11
2 12
3 13
4 14
5 15
6 16

I simply tried:

pd.merge([df_1, df_2], how = "outer", on = "Time")

I expected to get something like this:

Time Series 1 Series 2
0 nan 10
1 nan 11
2 nan 12
3 1 13
4 2 14
5 3 15
6 nan 16

Instead I got this:

Time Series 1 Series 2
3 1 13
4 2 14
5 3 15
6 nan 16

It is crucial for the exercise to keep the time series in the correct sequence, I do not want to switch the order of the merge.


Solution

  • instead of using 'outer' in how argument you can use 'right'. Below is the sample code:

    import pandas as pd
    import numpy as np
    
    df1 = pd.DataFrame({'Time': ['time 3', 'time 4', 'time 5'],
         'Series 1': [1, 2, 3]})
    df2 = pd.DataFrame({'Time': ['time 0', 'time 1', 'time 2', 'time 3', 'time 4', 'time 5', 'time 6'],'Series 2': [10, 11, 12, 13, 14, 15, 16]})
    
    merged_df = pd.merge(df1, df2, on='Time', how='right')
    merged_df = merged_df.sort_values('Time')
    merged_df = merged_df.reset_index(drop=True)
    merged_df['Series 1'] = merged_df['Series 1'].fillna(np.nan)
    

    I hope this will solve your problem.