I have a list of data frames and I need to merge them together using a unique column (date)
. Field names are different so concat is out.
I can manually use df[0].merge(df[1],on='Date').merge(df[3],on='Date)
etc. to merge each df one by one, but the issue is that the number of data frames in the list differs with user input.
Is there any way to merge that just combines all data frames in a list in one go? Or perhaps some for loop that does that?
You can use reduce
function where dfList
is your list of data frames:
import pandas as pd
from functools import reduce
reduce(lambda x, y: pd.merge(x, y, on = 'Date'), dfList)
As a demo:
df = pd.DataFrame({'Date': [1,2,3,4], 'Value': [2,3,3,4]})
dfList = [df, df, df]
dfList
# [ Date Value
# 0 1 2
# 1 2 3
# 2 3 3
# 3 4 4, Date Value
# 0 1 2
# 1 2 3
# 2 3 3
# 3 4 4, Date Value
# 0 1 2
# 1 2 3
# 2 3 3
# 3 4 4]
reduce(lambda x, y: pd.merge(x, y, on = 'Date'), dfList)
# Date Value_x Value_y Value
# 0 1 2 2 2
# 1 2 3 3 3
# 2 3 3 3 3
# 3 4 4 4 4