I have hourly maximum temperature data for 1970-2022 in NetCDF format, one file per calendar month (each monthly file holds all 24 hourly values per day for 1970-2022). Can anyone help me aggregate the hourly data to daily data in R? I first tried to do the hourly-to-daily conversion before downloading, but the request fails because the dataset is too large. Here is the code I tried:
wf_set_key(service = "cds")
data=c.retrieve(
'reanalysis-era5-single-levels',
{
'product_type': 'reanalysis',
'variable': 'maximum_2m_temperature_since_previous_post_processing',
'year': [
'1970', '1971', '1972',
'1973', '1974', '1975',
'1976', '1977', '1978',
'1979', '1980',
],
'month': [
'03','04',
'05', '06',
],
'day': [
'01', '02', '03',
'04', '05', '06',
'07', '08', '09',
'10', '11', '12',
'13', '14', '15',
'16', '17', '18',
'19', '20', '21',
'22', '23', '24',
'25', '26', '27',
'28', '29', '30','31',
],
'time': [
'00:00', '01:00', '02:00',
'03:00', '04:00', '05:00',
'06:00', '07:00', '08:00',
'09:00', '10:00', '11:00',
'12:00', '13:00', '14:00',
'15:00', '16:00', '17:00',
'18:00', '19:00', '20:00',
'21:00', '22:00', '23:00',
],
'area': [
38, 67, 6,
99
],
'format': 'netcdf',
},
'day_mean'=ct.climate.daily_mean(data,keep_attrs=True)
if count == 1:
day_mean_all=day_mean
else:
day_mean_all=ct.cube.concat([day_mean_all, day_mean], dim='time')
count = count + 1
return day_mean_all
'download.nc')
I am now trying to aggregate the hourly data of one month to daily values in R:
library(ncdf4)
ncpath <- "D:/MAX_TEMP/"
ncname <- "adaptor.mars.internal-1681202164.1038315-25242-15-2a718a58-dcd5-4470-9fd2-ddbdede30875_march"
ncfname <- paste(ncpath, ncname, ".nc", sep="")
ncin <- nc_open(ncfname)
print(ncin)
library(dplyr)
a1<-ncname %>%
group_by(time) %>%
summarize(Mean_Max_Temp = mean(expver))
#Error in UseMethod("group_by")
Please see the nice explanation given by Robert Hijmans here
I suppose you are familiar with the raster::brick() function for reading your NetCDF file (you could use the terra package as well, but let's stick to raster::brick() for now). You will then need to group the layers by day and month (and by year as well, since each of your files spans 1970-2022) in order to aggregate over the time dimension, which is hourly in your NetCDF file.
Finally, use the stackApply() function as shown in the link above. Hope this helps!
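The grouping step could look roughly like the sketch below. It is not taken from the linked answer, and it builds a small synthetic brick (48 hourly layers, i.e. two days) so that it runs without your file. With the real data you would instead start from something like b <- brick(ncfname, varname = "mx2t"), where "mx2t" is only an assumed variable name; check the output of print(ncin) for the actual one.

```r
library(raster)

# Synthetic stand-in for the hourly NetCDF: 48 hourly layers on a 3 x 3 grid.
# Layer i holds the constant value i, so the daily results are easy to check.
hours <- seq(as.POSIXct("1970-03-01 00:00", tz = "UTC"),
             by = "hour", length.out = 48)
b <- brick(array(rep(seq_len(48), each = 9), dim = c(3, 3, 48)))
b <- setZ(b, hours)  # attach the time stamps to the layers

# One integer index per layer: layers on the same calendar date share an index.
dates <- as.Date(getZ(b), tz = "UTC")
idx <- as.integer(factor(dates))

# Collapse every group of hourly layers into a single daily layer.
# fun = max gives the daily maximum; use fun = mean for a daily mean instead.
daily_max <- stackApply(b, indices = idx, fun = max)
names(daily_max) <- as.character(unique(dates))

print(daily_max)
```

Because the index is built from the full date rather than from day and month only, the same calendar day in different years stays separate. writeRaster(daily_max, "daily_max.nc", format = "CDF") should then save the result as NetCDF (the output file name here is just an example).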