rrasternetcdfterra

Extracting time from an unconventionally-named netcdf using terra


I have a netcdf file (ERA5 climate data) that I'm converting to a SpatRaster using terra. I would like to access the time data for each layer; however, the date element of the original file is named valid_time rather than something more traditional. Possibly because of this, it does not seem to be recognized by the terra::time argument.

Here is the code used to process the netcdf file. Apologies for the lack of reproducible example, this problem is specific to the raster and I'm not sure how to recreate it:

library(terra)
ncfile <- 'filepath'
sample_rast <- terra::rast(ncfile)

sample_rast
class       : SpatRaster 
dimensions  : 1801, 3600, 36  (nrow, ncol, nlyr)
resolution  : 0.1, 0.1  (x, y)
extent      : -0.05, 359.95, -90.05, 90.05  (xmin, xmax, ymin, ymax)
coord. ref. :  
source      : demo4.nc:t2m 
varname     : t2m (2 metre temperature) 
names       : t2m_v~70400, t2m_v~48800, t2m_v~68000, t2m_v~46400, t2m_v~38400, t2m_v~16800, ... 
unit        :           K,           K,           K,           K,           K,           K, ... 
#this is where time appears in examples; it is not present in my raster

sample_dates <- terra::time(sample_rast)
sample_dates

[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

I can access the time data using ncdf4, using the following code:

library(ncdf4)
nc <- nc_open(ncfile)
date_array <- ncvar_get(nc, "valid_time")
nc_close(nc)

This makes me suspect that it's a naming problem, but I don't see anything in the doc for terra::time that would help me direct it to look for "valid_time".

I can use the date array from ncdf4, obviously, but I'd like to see if there's a way to get this same information through terra.


Solution

  • ERA5 files use "time" as the name for the temporal axis of the data so I am curious to learn what specific ERA5 data you have to work with. Is this maybe the new netCDF format used by ECMWF for hourly data in a beta release?

    With package ncdfCF you can read netCDF files including the attributes that define much of what is going on inside. That latter part is quite important because the CF Metadata Conventions that many netCDF files adhere to (probably most, as it is the recommended standard) use the attributes to describe many important aspects of the data variables, including its "time" dimension. Handily, ncdfCF can also produce a terra::SpatRaster for you:

    library(ncdfCF)
    
    # Open the file
    nc <- open_ncdf(nc_file)
    
    # Read the data variable of interest
    (t2m <- nc[["t2m"]]$data())
    > <Data> t2m 
    > Long name: 2 metre temperature 
    > 
    > Values: [273.4626 ... 318.3601] K
    >     NA: 14843752 (9.1%)
    > 
    > Axes:
    >  id axis name      length values                      unit                             
    >  0  X    longitude    78  [100 ... 107.7]             degrees_east                     
    >  1  Y    latitude     87  [22.5 ... 13.9]             degrees_north                    
    >  2  T    time      24097  [2019-01-01 ... 2021-10-01] hours since 1900-01-01 00:00:00.0
    > 
    > Attributes:
    >  id name          type      length value               
    >  0  scale_factor  NC_DOUBLE  1     0.000685112247836968
    >  1  add_offset    NC_DOUBLE  1     295.911034397001    
    >  2  _FillValue    NC_SHORT   1     -32767              
    >  3  missing_value NC_SHORT   1     -32767              
    >  4  units         NC_CHAR    1     K                   
    >  5  long_name     NC_CHAR   19     2 metre temperature 
    
    # Make a SpatRaster
    r <- t2m$terra()
    
    # ncdfCF puts the 3rd dimension dimnames as layer names. In the case of single-layer 
    # ERA5 files that is the "time" dimension. The "time" property itself is not set.
    names(r)[1:10]
    >  [ 1] "2019-01-01 00:00:00" "2019-01-01 01:00:00" "2019-01-01 02:00:00"
    >  [ 4] "2019-01-01 03:00:00" "2019-01-01 04:00:00" "2019-01-01 05:00:00"
    >  [ 7] "2019-01-01 06:00:00" "2019-01-01 07:00:00" "2019-01-01 08:00:00"
    >  [10] "2019-01-01 09:00:00"
    has.time(r)
    > FALSE
    
    # Set the time information in the SpatRaster. Note that this should be a POSIXt
    # but I get an error when using that
    time(r) <- as.Date(names(r))
    has.time(r)
    > TRUE
    
    time(r)[1:10]
    >  [1] "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"
    >  [6] "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"
    

    A few things to note: