rnetcdfr-raster

Reading netcdf in R with lon lat dimensions reported as single 1D vector


I need to work with these data in R: https://zenodo.org/records/1209296 in particular "irrigation water use v2.7z". I opened the netcdf inside R both with terra and raster library. I should have a lon lat grid over the world with 480 layers (months). But the format when I open them is the following.

class      : RasterLayer 
dimensions : 480, 64028, 30733440  (nrow, ncol, ncell)
resolution : 0.007481219, 1  (x, y)
extent     : 0.9962594, 480.0037, 0.5, 480.5  (xmin, xmax, ymin, ymax)
crs        : NA 
source     : withd_irr_h08.nc 
names      : withd_irr 
zvar       : withd_irr 

How can I move to a raster with two dimension (lat, lon) and 480 layers? The resolution is 0.5°.


Solution

  • Your data is completely non-standard and not self-descriptive. In other words, no library will give you a one-stop solution.

    With package ncdfCF I see the following file structure:

    > library(ncdfCF)
    
    > irr <- open_ncdf("~/withd_irr_h08.nc")
    > irr
    <Dataset> withd_irr_h08 
    Resource   : withd_irr_h08.nc 
    Format     : classic 
    Conventions: (not indicated) 
    Keep open  : FALSE 
    
    Variables:
     name      units    data_type axes           
     withd_irr mm/month NC_DOUBLE grid_num, month
     lon       degree   NC_DOUBLE grid_num       
     lat       degree   NC_DOUBLE grid_num       
    
    Axes:
     id axis name     length values                      unit               
     1  T    month      480  [1971-02-01 ... 2011-01-01] months since 1971-1
     0       grid_num 64028  [1 ... 64028]
    

    The data variable withd_irr does not use a longitude-latitude grid, instead it has an axis grid_num which holds 64,028 locations, same as the lon and lat variables (which you want to be axes if terra should recognise them, but they are not).

    The solution to your problem is to extract the data for the 3 variables, put them in a data.frame, then convert that to a terra::SpatRaster. You will have to do the conversion for each month individually because the various steps are available for matrices but not for arrays:

    # Read entire data arrays
    > lon <- irr[["lon"]][]
    > lat <- irr[["lat"]][]
    > withd <- irr[["withd_irr"]][]
    
    # Make a data.frame with lon, lat and all 480 months of withd_irr
    > df <- data.frame(lon, lat, withd, check.names = FALSE)
    
    # Convert a month of data to a SpatRaster
    > r <- rast(df[c("lon", "lat", "1971-02-01")])
    > r
    class       : SpatRaster 
    dimensions  : 279, 720, 1  (nrow, ncol, nlyr)
    resolution  : 0.5, 0.5  (x, y)
    extent      : -180, 180, -56, 83.5  (xmin, xmax, ymin, ymax)
    coord. ref. :  
    source(s)   : memory
    name        : 1971-02-01 
    min value   :    0.00000 
    max value   :   96.28087
    

    All months in one go

    Putting it all together and making a single SpatRaster:

    > library(ncdfCF)
    
    > irr <- open_ncdf("~/withd_irr_h08.nc")
    > lon <- irr[["lon"]][]
    > lat <- irr[["lat"]][]
    > withd <- irr[["withd_irr"]][]
    > df <- data.frame(lon, lat, withd, check.names = FALSE)
    > mon_str <- names(df)[-(1:2)] # drop lon, lat
    
    # Build the SpatRaster structure from the first month
    > r <- rast(df[c("lon", "lat", mon_str[1])])
    
    # Now add the remaining months
    > for (i in 2:length(mon_str)) add(r) <- rast(df[c("lon", "lat", mon_str[i])])
    > r
    class       : SpatRaster 
    dimensions  : 279, 720, 480  (nrow, ncol, nlyr)
    resolution  : 0.5, 0.5  (x, y)
    extent      : -180, 180, -56, 83.5  (xmin, xmax, ymin, ymax)
    coord. ref. :  
    source(s)   : memory
    names       : 1971-02-01, 1971-03-01, 1971-04-01, 1971-05-01, 1971-06-01, 1971-07-01, ... 
    min values  :    0.00000,     0.0000,     0.0000,     0.0000,     0.0000,      0.000, ... 
    max values  :   96.28087,   150.5525,   491.3734,   448.1106,   269.3665,    331.089, ...