I have daily geo-spatial data over a period of 2 years. The data is in two nc files, each containing 365 days' worth of data. I want to apply a calculation to the data, and then calculate the monthly mean of this for each 12 months of each year across each gridbox.
lons <- 584
lats <- 712
MyValues_ym <- array(0., c(lons, lats, 2, 12))
MyCalculation <- array(0., c(lons, lats, ts)) ## set up data array to fit dailycalculations
MyMonths = c('01','02','03','04','05','06','07','08','09','10','11','12')
ts <- dim(P)[[3]] ## set MyCalculation array coordinates the same as precipitation coordinates
## Dimension of precipitation coordinates: [584L, 712L, 365L]
## calculate MeanMonthly #calculated monthly q0
## Loop over months
for (m in 1:12) {
MyValues_ym[, , y - 1982, m] <- apply(MyCalculation, c(2), mean)
}
This is giving the error:
Error in `[<-`(`*tmp*`, , , y - 1982, m, value = c(NA_real_, NA_real_, :
subscript out of bounds
Any help would be greatly appreciated.
I have tried to use rowMeans
, but this was not calculating the mean for the time dimension of my array. I am not sure how to calculate the mean for the particular dimension required.
I tried this with RowMeans:
# Loop over months
for (m in 1:12) {
my_months_yearly <- format(daterange, '%m') == MyMonths[m]
MyValues_ym[,,y-1982,m]=rowMeans(Q0d[,,my_months_yearly], na.rm = TRUE, dims = 2)
EDIT
I have now made a MRE:
lons <- 10
lats <- 10
MyYearlyMonthlyData <- array(0., c(lons,lats,1,12)) ## set up array for monthly calculation
MyCalculationYr1 <- array(runif(1:100), c(lons, lats, 365)) ## set up data array to fit dailycalculations
## set up the date range and months for the loop
start = '1985-01-01'
end = '1985-12-31'
daterange = seq(as.Date(start), as.Date(end), "days")
MyMonths = c('01','02','03','04','05','06','07','08','09','10','11','12')
y=1985
## the loop to generate monthly values of each day of each month for the year
for (m in 1:12) {
blMM <- format(daterange,'%m') == MyMonths[m]
MyYearlyMonthlyData[,,y-1984,m] = rowMeans(MyCalculationYr1[,,blMM],na.rm=TRUE,dims=2)
}
The precipitation data is from the TAMSAT archive (as per your comment under your post). I quickly downloaded a file and it conforms to the CF Metadata Conventions, as is common for meteorological data distributed through NetCDF files. You can check that easily by checking for the "Conventions" global attribute:
nc <- nc_open("./rfe_file.nc")
ncatt_get(nc, "")$Conventions
These files typically contain 3-d arrays of data, as is the case here. Dimensions are lon
, lat
and time
. You can easily work with the time
dimension using the CFtime package:
library(CFtime)
# Create a CFtime instance from the "time" dimension in your file
cf <- CFtime(nc$dim$time$units, nc$dim$time$calendar, nc$dime$time$vals)
# Create a factor over the months in your data file
mon <- CFfactor(cf, "month")
# Extract the data from the file
data <- ncvar_get(nc, "ref_filled")
# Aggregate to monthly mean (mm/day)
pr <- apply(data, 1:2, tapply, mon, mean)
# Re-arrange the dimension after tapply() mixed them up
pr <- aperm(pr, c(2, 3, 1))
# Label the dimensions
dimnames(pr) <- list(nc$dim$lon$vals, nc$dim$lat$vals, levels(mon))
nc_close(nc)
Note that you will now have a 3-d array with the 3rd dimension having 24 elements with labels such as "2021-01", "2021-02" ... "2022-12". The apply()
function applied the function tapply()
over the first two dimensions, with tapply()
applying the mean()
function using the factor mon
, so in the result every grid cell has the monthly mean values for the two years.