I have data covering the period from December 2013 to November 2018. I converted it into a data frame, as shown here.
Date 0.1 0.2 0.3 0.4 0.5 0.6
2013-12-01 301.04 297.4 296.63 295.76 295.25 295.25
2013-12-04 297.96 297.15 296.25 295.25 294.43 293.45
2013-12-05 298.4 297.61 296.65 295.81 294.75 293.89
2013-12-08 298.82 297.95 297.15 296.25 295.45 294.41
2013-12-09 298.65 297.65 296.95 296.02 295.13 294.05
2013-12-12 299.05 297.33 296.65 295.81 294.85 293.85
2013-12-16 301.05 300.28 299.38 298.45 297.65 296.51
....
2014-01-10 301.65 297.45 296.46 295.52 294.65 293.56
2014-01-11 301.99 298.95 298.39 297.15 296.05 295.11
2014-01-12 299.86 298.65 297.73 296.82 296.35 295.37
2014-01-13 299.25 298.15 297.3 296.43 295.26 294.31
I want to take the monthly mean and the seasonal mean of this data.
For the monthly mean I tried
df.resample('M').mean()
And it worked well.
For the seasons, I would like to decompose this data into four three-month seasons (Dec-Feb, Mar-May, Jun-Aug, and Sep-Nov). I tried resampling with a 3-month interval, i.e.
df.resample('3M').mean()
However, this did not work well: it gives the average for the starting December on its own and then applies the 3-month interval to the calendar year (i.e. January to March, and so on).
I would like to know whether there is a way to avoid this by specifying the month in which the first period begins.
Moreover, I would also like to know whether we can define these seasons beforehand and group the data accordingly, to get the averages more easily.
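The origin argument of resample only shifts fixed-width frequencies such as days or hours, so it has no effect on month-based bins. Instead, anchor a quarterly frequency to December, so that the bins start in December, March, June and September:
df.resample('QS-DEC').mean()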
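For the second part of the question (defining the seasons beforehand and grouping on them), one possible approach is to map each calendar month to a season label and group on that label. The following is a minimal sketch, assuming the Date column has been parsed into a DatetimeIndex. The names SEASONS, season and season_year are hypothetical, and the sample frame holds only four rows and two columns copied from the question. December is counted toward the following year's winter so that each Dec-Feb season stays in one group.

```python
import pandas as pd

# Four rows and the first two value columns copied from the question's sample.
df = pd.DataFrame(
    {
        "0.1": [301.04, 297.96, 301.65, 301.99],
        "0.2": [297.40, 297.15, 297.45, 298.95],
    },
    index=pd.to_datetime(["2013-12-01", "2013-12-04", "2014-01-10", "2014-01-11"]),
)
df.index.name = "Date"

# Hypothetical lookup table: calendar month -> season label.
SEASONS = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}

labelled = df.copy()
labelled["season"] = df.index.month.map(SEASONS)
# Count December toward the next year, so Dec 2013 joins Jan-Feb 2014.
labelled["season_year"] = df.index.year + (df.index.month == 12).astype(int)

# One mean per season per year.
seasonal_mean = labelled.groupby(["season_year", "season"]).mean()

# Cross-check: quarterly bins anchored so that they start in Dec, Mar, Jun, Sep.
quarterly = df.resample("QS-DEC").mean()

print(seasonal_mean)
print(quarterly)
```

Grouping on "season" alone, instead of on both keys, would give one long-term mean per season across all years.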