I'd like to fill in the following dataframe with dates between a range and front fill all the columns. As I am completing this, I would like to append the lists in the Wells column so it becomes continues to add items by date.
Dataframe to expand by date
StartDate Wells count Sum_Cumm vol
0 1967-10-01 [MUN-523, MUN-354, MUN-2660] 50 50 8503.323620
1 1968-01-01 [MUN-152] 1 51 8336.591784
2 1968-03-01 [MUN-1032] 1 52 8176.272712
3 1968-10-01 [MUN-16128] 1 53 9191.110200
Code that I'm working on
newdf = (newdf.set_index('StartDate').reindex(pd.date_range('10-01-1967', '12-31-1994', freq='MS')).rename_axis(['StartDate']).reset_index()).ffill(newdf['vol'])
Dataframe that I'd like to end up with
StartDate Wells count Sum_Cumm vol
0 1967-10-01 [MUN-523, MUN-354, MUN-2660] 50 50 8503.323620
1 1967-11-01 [MUN-523, MUN-354, MUN-2660] 1 51 8503.323620
2 1967-12-01 [MUN-523, MUN-354, MUN-2660] 1 51 8503.323620
3 1968-01-01 [MUN-523, MUN-354, MUN-2660,MUN-152] 1 52 8336.591784
4 1968-02-01 [MUN-523, MUN-354, MUN-2660,MUN-152] 1 53 8336.591784
5 1968-03-01 [MUN-523, MUN-354, MUN-2660,MUN-152,MUN-1032] 1 53 8176.272712
6 1968-04-01 [MUN-523, MUN-354, MUN-2660,MUN-152,MUN-1032] 1 53 8176.272712
You can use period_range
to create a new index and reindex the existing df to create a new dataframe and then ffill.
For Wells
column do a cumsum
and then apply np.unique
new_idx = pd.to_datetime(
pd.period_range(df["StartDate"].min(), df["StartDate"].max(), freq="M")
.asfreq("D", how="S")
.strftime("%Y-%m-%d")
)
df2 = df.set_index("StartDate").reindex(new_idx)
df2 = df2.ffill(downcast="infer")
df2["Wells"] = df2["Wells"].cumsum().apply(np.unique)
df2 = df2.rename_axis("StartDate").reset_index()
print(df2)
StartDate Wells count \
0 1967-10-01 [MUN-2660, MUN-354, MUN-523] 50
1 1967-11-01 [MUN-2660, MUN-354, MUN-523] 50
2 1967-12-01 [MUN-2660, MUN-354, MUN-523] 50
3 1968-01-01 [MUN-152, MUN-2660, MUN-354, MUN-523] 1
4 1968-02-01 [MUN-152, MUN-2660, MUN-354, MUN-523] 1
5 1968-03-01 [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523] 1
6 1968-04-01 [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523] 1
7 1968-05-01 [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523] 1
8 1968-06-01 [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523] 1
9 1968-07-01 [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523] 1
10 1968-08-01 [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523] 1
11 1968-09-01 [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523] 1
12 1968-10-01 [MUN-1032, MUN-152, MUN-16128, MUN-2660, MUN-3... 1
Sum_Cumm vol
0 50 8503.323620
1 50 8503.323620
2 50 8503.323620
3 51 8336.591784
4 51 8336.591784
5 52 8176.272712
6 52 8176.272712
7 52 8176.272712
8 52 8176.272712
9 52 8176.272712
10 52 8176.272712
11 52 8176.272712
12 53 9191.110200