I have a monthly mean precipitation netCDF file that covers a period of five years. The specifics of the file are:
<xarray.Dataset>
Dimensions: (time: 60, rlat: 412, rlon: 424)
Coordinates:
lat (rlat, rlon) float64 21.99 22.03 22.07 22.11 ... 66.81 66.75 66.69
lon (rlat, rlon) float64 -10.06 -9.964 -9.864 ... 64.55 64.76 64.96
* rlat (rlat) float64 -23.38 -23.26 -23.16 -23.05 ... 21.61 21.73 21.83
* rlon (rlon) float64 -28.38 -28.26 -28.16 -28.05 ... 17.93 18.05 18.16
* time (time) datetime64[ns] 1951-01-01 1951-02-01 ... 1955-12-01
Data variables:
pr (time, rlat, rlon) float32 2.053e-18 2.053e-18 ... 8.882e-06
Attributes: (12/22)
CDI: Climate Data Interface version 1.3.2
Conventions: CF-1.6
NCO: 4.4.2
CDO: Climate Data Operators version 1.3.2 (htt...
contact: Fredrik Boberg, Danish Meteorological Ins...
creation_date: 2019-10-15 18:05:48
... ...
rcm_version_id: v1
project_id: CORDEX
CORDEX_domain: EUR-11
product: output
tracking_id: hdl:21.14103/a879aaf7-ddeb-436a-96fd-b717...
c3s_disclaimer: This data has been produced in the contex...
From these data I need only the monthly precipitation for the months May through October. I did this with:
out_ds = out_ds.sel(time=out_ds.time.dt.month.isin([5, 6, 7, 8, 9, 10]))
where out_ds is my original xarray Dataset. After this selection the data look like:
<xarray.Dataset>
Dimensions: (time: 30, rlat: 412, rlon: 424)
Coordinates:
lat (rlat, rlon) float64 21.99 22.03 22.07 22.11 ... 66.81 66.75 66.69
lon (rlat, rlon) float64 -10.06 -9.964 -9.864 ... 64.55 64.76 64.96
* rlat (rlat) float64 -23.38 -23.26 -23.16 -23.05 ... 21.61 21.73 21.83
* rlon (rlon) float64 -28.38 -28.26 -28.16 -28.05 ... 17.93 18.05 18.16
* time (time) datetime64[ns] 1951-05-01 1951-06-01 ... 1955-10-01
Data variables:
pr (time, rlat, rlon) float32 1.957e-18 1.957e-18 ... 1.432e-05
Attributes: (12/22)
CDI: Climate Data Interface version 1.3.2
Conventions: CF-1.6
NCO: 4.4.2
CDO: Climate Data Operators version 1.3.2 (htt...
contact: Fredrik Boberg, Danish Meteorological Ins...
creation_date: 2019-10-15 18:05:48
... ...
rcm_version_id: v1
project_id: CORDEX
CORDEX_domain: EUR-11
product: output
tracking_id: hdl:21.14103/a879aaf7-ddeb-436a-96fd-b717...
c3s_disclaimer: This data has been produced in the contex...
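For anyone who wants to try the selection step without the original file, here is a minimal sketch on a synthetic dataset. The tiny 2 x 3 grid, the random values, and the names `ds`, `warm_months`, and `subset` are assumptions made for illustration; only the `time.dt.month.isin(...)` selection itself comes from the code above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real file (assumption): 60 monthly steps
# starting in January 1951, on a tiny 2 x 3 rotated grid.
time = pd.date_range("1951-01-01", periods=60, freq="MS")
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"pr": (("time", "rlat", "rlon"), rng.random((60, 2, 3)).astype("float32"))},
    coords={"time": time, "rlat": [0.0, 1.0], "rlon": [0.0, 1.0, 2.0]},
)

# Boolean mask along the time axis: True where the month is May..October.
warm_months = ds.time.dt.month.isin([5, 6, 7, 8, 9, 10])

# Keep only the time steps where the mask is True.
subset = ds.sel(time=warm_months)

print(subset.sizes["time"])
print(np.unique(subset.time.dt.month.values))
```

Six months kept in each of five years leaves 30 time steps, which matches the `time: 30` dimension shown in the output above.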
What I want to achieve now is to take the mean of each month across all years. In other words, I want the average of May, June, ..., October over the years 1951 to 1955, so the time dimension should have length 6.
I actually found it. It was quite simple. You group by month and then take the mean with the following line of code:
out_ds = out_ds.groupby('time.month').mean()
The output is:
<xarray.Dataset>
Dimensions: (month: 6, rlat: 412, rlon: 424)
Coordinates:
lat (rlat, rlon) float64 21.99 22.03 22.07 22.11 ... 66.81 66.75 66.69
lon (rlat, rlon) float64 -10.06 -9.964 -9.864 ... 64.55 64.76 64.96
* rlat (rlat) float64 -23.38 -23.26 -23.16 -23.05 ... 21.61 21.73 21.83
* rlon (rlon) float64 -28.38 -28.26 -28.16 -28.05 ... 17.93 18.05 18.16
* month (month) int64 5 6 7 8 9 10
Data variables:
pr (month, rlat, rlon) float32 1.955e-18 1.955e-18 ... 2.622e-05
As expected, the time dimension has been replaced by a month dimension of length 6.
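A small self-contained sketch of the same group-and-average step, with a hand-computed check for May, may help confirm what `groupby('time.month').mean()` is doing. The synthetic data and the names `ds`, `subset`, `clim`, and `may_by_hand` are assumptions for illustration, not part of the original file. The output above also no longer lists the global attributes; passing `keep_attrs=True` to `mean` is one way to retain them, if that matters for your workflow.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real file (assumption): five years of monthly data.
time = pd.date_range("1951-01-01", periods=60, freq="MS")
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"pr": (("time", "rlat", "rlon"), rng.random((60, 2, 3)))},
    coords={"time": time, "rlat": [0.0, 1.0], "rlon": [0.0, 1.0, 2.0]},
    attrs={"project_id": "CORDEX"},
)
subset = ds.sel(time=ds.time.dt.month.isin([5, 6, 7, 8, 9, 10]))

# One group per calendar month; each group holds that month from all five
# years and is averaged over time. keep_attrs=True carries the attributes over.
clim = subset.groupby("time.month").mean(keep_attrs=True)

# Cross-check May by hand: average the five May fields directly.
may_by_hand = subset.pr.sel(time=subset.time.dt.month == 5).mean("time")

print(clim.sizes["month"])
print(np.allclose(clim.pr.sel(month=5).values, may_by_hand.values))
```

Note that this is an unweighted mean of the monthly means, so every year counts equally for a given calendar month.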