I have a pandas Series with measurements at a 1-minute interval, and I want to downsample it to a 5-minute interval. series contains measurements from the end of October 18th, none from October 19th, and then measurements from the start of October 20th. Using series.resample("5T").mean() fills October 19th with NaNs, and series.resample("5T").sum() fills the missing day with 0s:
import pandas as pd

index1 = pd.date_range("2023-10-18 23:50", "2023-10-18 23:59", freq="T")
index2 = pd.date_range("2023-10-20 00:00", "2023-10-20 00:10", freq="T")
series1 = pd.Series(range(len(index1)), index=index1)
series2 = pd.Series(range(100, len(index2)+100), index=index2)
series = pd.concat([series1, series2])
series.resample("5T").mean()
Out:
2023-10-18 23:50:00 2.0
2023-10-18 23:55:00 7.0
2023-10-19 00:00:00 NaN
2023-10-19 00:05:00 NaN
2023-10-19 00:10:00 NaN
...
2023-10-19 23:50:00 NaN
2023-10-19 23:55:00 NaN
2023-10-20 00:00:00 102.0
2023-10-20 00:05:00 107.0
2023-10-20 00:10:00 110.0
Freq: 5T, Length: 293, dtype: float64
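The sum() case can be checked the same way. This is only a minimal sketch that rebuilds the same series. It spells the frequency as "min" rather than "T" (the two mean the same thing, and recent pandas releases deprecate the "T" alias), and the names summed and gap are illustrative, not part of the original code.

```python
import pandas as pd

# Rebuild the example series: 10 minutes on Oct 18, nothing on Oct 19,
# 11 minutes on Oct 20.
index1 = pd.date_range("2023-10-18 23:50", "2023-10-18 23:59", freq="min")
index2 = pd.date_range("2023-10-20 00:00", "2023-10-20 00:10", freq="min")
series = pd.concat([
    pd.Series(range(len(index1)), index=index1),
    pd.Series(range(100, len(index2) + 100), index=index2),
])

# resample() builds one continuous 5-minute grid from the first to the
# last timestamp, so every bin on Oct 19 exists but is empty.
summed = series.resample("5min").sum()

# Select the bins that fall on the missing day (partial string indexing).
gap = summed.loc["2023-10-19"]

print(len(summed))        # total number of 5-minute bins
print(len(gap))           # bins created for the missing day
print((gap == 0).all())   # True if every empty bin was summed to 0
```

The empty bins come from the continuous grid itself, which is why both mean() and sum() produce rows for October 19th and differ only in the fill value.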
I need pd.Series.resample to stick to the days that are present in series and not fill in anything for the missing day. How can this be done?
You could group by the date first and then resample, so that each day is resampled on its own:
series.groupby(series.index.date).resample("5T").mean()
2023-10-18  2023-10-18 23:50:00      2.0
            2023-10-18 23:55:00      7.0
2023-10-20  2023-10-20 00:00:00    102.0
            2023-10-20 00:05:00    107.0
            2023-10-20 00:10:00    110.0
dtype: float64
Add .droplevel(0) if you don't want the date in the output.
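Put together, the approach might look like the sketch below. It rebuilds the series from the question, uses "5min" as the equivalent of "5T" (recent pandas releases deprecate the "T" alias), and the names per_day and result are just illustrative.

```python
import pandas as pd

# Same example data as in the question.
index1 = pd.date_range("2023-10-18 23:50", "2023-10-18 23:59", freq="min")
index2 = pd.date_range("2023-10-20 00:00", "2023-10-20 00:10", freq="min")
series = pd.concat([
    pd.Series(range(len(index1)), index=index1),
    pd.Series(range(100, len(index2) + 100), index=index2),
])

# Group by calendar date so that each day gets its own 5-minute grid.
# Days with no rows form no group, so no bins are created for them.
per_day = series.groupby(series.index.date).resample("5min").mean()

# The result has a two-level index (date, timestamp); drop the outer
# date level to get a plain DatetimeIndex back.
result = per_day.droplevel(0)

print(result)
```

Note that this only skips whole days that are absent. A gap inside a day that does have data is still resampled over that day's first-to-last range, so empty bins within such a day would still show up as NaN (or 0 with sum()).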