rarraysnetcdf

CMIP6 How to merge multiple .nc files in R


I want thank everyone for taking the time to answer this question. I cannot seem to find how to merge 2 or more .nc files in R--this is after a solid few days of trying different solns from here, youtube, books, and speaking with colleagues etc..

In this example, I have 10 files from a single climate model containing temperature data from 1850 to 1949. Each file contains georeferenced data for 9 years. I want monthly avg temps for a single month across all years in a single .nc file. See example code below:

library(ncdfCF)

####I am testing 2 months, then will include all the files once I know it works. 
Basic idea, open files, create array, average across years for a given month, 
then merge the output and convert to raster for further processing.###

ds1 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_185001-185912.nc")
ds2 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_186001-186912.nc")
ds3 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_187001-187912.nc")
ds4 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_188001-188912.nc")
ds5 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_189001-189912.nc")
ds6 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_190001-190912.nc")
ds7 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_191001-191912.nc")
ds8 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_192001-192912.nc")
ds9 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_193001-193912.nc")
ds10 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_194001-194912.nc")

ds1 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_185001-185912.nc")
ts <- ds1[["ts"]]
d <- ts$data()
d
t <- d$axes[["time"]]
t
arr <- d$array()
str(arr)
cft <- t$values
fac <- CFfactor(cft, epoch = 1850:1859)
res <- apply(arr, 1:2, tapply, fac, mean)
res <- aperm(res, c(2, 3, 1))
str(res)

### Looking good:

> str(res)
 num [1:192, 1:288, 1:12] 234 234 235 235 235 ...
 - attr(*, "dimnames")=List of 3
  ..$ lat: chr [1:192] "90" "89.05759" "88.11518" "87.17277" ...
  ..$ lon: chr [1:288] "0" "1.25" "2.5" "3.75" ...
  ..$    : chr [1:12] "01" "02" "03" "04" ...

###


ds2 <- open_ncdf("ts_Amon_SAM0-UNICON_historical_r1i1p1f1_gn_186001-186912.nc")
ts <- ds2[["ts"]]
d <- ts$data()
d
t <- d$axes[["time"]]
t
arr <- d$array()
str(arr)
cft <- t$values
fac <- CFfactor(cft, epoch = 1860:1869)
res2 <- apply(arr, 1:2, tapply, fac, mean)
res2 <- aperm(res2, c(2, 3, 1))
str(res2)

###Also checks out:

> str(res2)
 num [1:192, 1:288, 1:12] 234 235 235 235 236 ...
 - attr(*, "dimnames")=List of 3
  ..$ lat: chr [1:192] "90" "89.05759" "88.11518" "87.17277" ...
  ..$ lon: chr [1:288] "0" "1.25" "2.5" "3.75" ...
  ..$    : chr [1:12] "01" "02" "03" "04" ...
###

##To merge the separate datasets, I am trying to use abind###


test<-abind(res, res2, rev.along=0)
str(test)

###Looks good, I see 12 months listed (previous iterations/attempts gave me 24 time steps
 using rev.along=1 or along=3). BUT, I now have NULL as an added dimension.

> str(test)
 num [1:192, 1:288, 1:12, 1:2] 234 234 235 235 235 ...
 - attr(*, "dimnames")=List of 4
  ..$ : chr [1:192] "90" "89.05759" "88.11518" "87.17277" ...
  ..$ : chr [1:288] "0" "1.25" "2.5" "3.75" ...
  ..$ : chr [1:12] "01" "02" "03" "04" ...
  ..$ : NULL
###

res.jan <- test[, , 1, NULL]
Error in test[, , 1, NULL] : incorrect number of dimensions
 
> res.jan <- test[, , 1, ,]
Error in test[, , 1, , ] : incorrect number of dimensions

###and purely for learning/curiousity:###

> res.jan <- test[1, 1, 1, NULL] 
> resr.jan<- raster(t(res.jan), xmn=min(lon), xmx=max(lon), ymn=min(lat), ymx=max(lat), crs=CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs+ towgs84=0,0,0"))
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'extent': object 'lon' not found

###Incidentally, I have run into this last error when working with other files that did not
 need to be merged, sometimes I get the error, sometimes I don't. Not sure what seems to 
cause it. This last point/error of a missing 'lon' value is secondarily important to my
 primary issue of merging the files; I brought it up because I would like to learn how to 
fix/debug for the future. Closing R, reopening, rm=(), then crossing fingers and running
 code sometimes works and sometimes doesn't, but CLEARLY is a newb solution and less than
 optimal.###
 

THANKS!! Any ideas? Thanks.


Solution

  • Following on your previous question and the accepted answer there, you could simply bind the arrays from the function like so:

    # `res` is the output from the function in the previously accepted answer
    test <- abind(res, along = 3)
    

    If you use a full set of data files from 1850 to 1949 in the solution suggested in the previous answer, the processCMIP6model() function, then this last abind() function call will glue everything together along the "time" dimension over the century from 1850.

    You should not use abind() as you do as you will get a 4th dimension with 12 months for each file that you process, instead you want to merge them into the 3rd dimension.