python pandas aggregate resample

Pandas: resampling data with an interval larger than the whole index range


Situation

I have the following pandas time series data:

date predicted1
2001-03-13 0.994756
2005-08-22 0.551661
2000-05-07 0.001396

I need to handle resampling into an interval larger than the range the data covers (about 5 years), e.g. 10 years:

sample = data.set_index(pd.DatetimeIndex(data['date'])).drop('date', axis=1)['predicted1']
sample.resample('10Y').sum()

I get the following:

date
2000-12-31 0.001396
2010-12-31 1.546418

So the resampling function groups the data from the first year separately from the data from the other years.

Question

How can I group all the data into a single 10-year interval? I want to get something like this:

date
2000-12-31 1.5478132011506138

Solution

  • You can change the origin, the closed side and the label of the bins in resample:

    sample.resample('10Y', origin=sample.index.min(), closed='left', label='left').sum()
    

    Output:

    date
    1999-12-31    1.547813
    Freq: 10A-DEC, Name: predicted1, dtype: float64
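
As a cross-check that does not depend on where resample places its bin edges, here is a minimal sketch that groups by calendar decade instead. Note that this swaps in a different technique (a plain groupby on the year) rather than the resample call above, so the bins are fixed calendar decades (2000-2009, 2010-2019, ...) and the result is labelled with the decade's first year, not a timestamp. The data is retyped from the question, and the names `decade` and `totals` are only illustrative.

```python
import pandas as pd

# Sample data retyped from the question.
data = pd.DataFrame({
    "date": ["2001-03-13", "2005-08-22", "2000-05-07"],
    "predicted1": [0.994756, 0.551661, 0.001396],
})

# Same series as in the question: values indexed by a DatetimeIndex.
sample = data.set_index(pd.DatetimeIndex(data["date"]))["predicted1"]

# Map each timestamp to the first year of its calendar decade (2000, 2010, ...),
# then sum the values that share a decade.
decade = (sample.index.year // 10) * 10
totals = sample.groupby(decade).sum()

print(totals)
```

With the three rows above, everything falls into the 2000 bucket, so `totals` has a single entry whose value is about 1.547813. If the data does not line up with calendar decades (say, 1996 to 2004), this approach would split it into two buckets, so it only fits when decade boundaries are what you want.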