rextractterratemporalr-stars

Extracting points from a raster by matching in time and space


I have a spatrast with information on current velocities which I calculated from u and v components.

How can I extract points from this SPATRAST to a dataframe based on time and spatial matches?

code for recreating example data:

library(stars)
library(lubridate)
library(terra)

# create a datetime vector holding 364 entries at one hour intervals####
datetime <- seq(as.POSIXct("2020-01-01 00:00:00"), as.POSIXct("2020-01-17 00:00:00"), by="hour")
datetime <- datetime[1:364]

# Create a sample rast ####
v <- rast(ncol=37, nrow=74, nlyr=364, extent=ext(-6.3875, -5.4875, 52.2375, 54.0625), 
          crs="+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs+ towgs84=0,0,0", time=datetime)
v <- init(v, runif)
 
# Create a sample df 
time <- as.POSIXct(c("2020-01-01 15:21:00", "2020-01-14 08:01:00", "2020-01-16 21:19:00"))
lat <- c("53.92265", "52.86578", "53.38290")
long <-c("-6.026689", "-5.819075", "-6.028343")
sighting <- c("1",  "0", "1")
df <- data.frame(sighting, lat, long, time)
DF<- st_as_sf(df, coords = c("long", "lat")) 
DF = st_set_crs(DF, "EPSG:4326") 
#DF = st_transform(DF, 2157) # CRS ITM

I tired to use st_extract from stars:

# save spatrast as a stars object
v.star <-st_as_stars(v)
#v.star = st_transform(v.star, 2157) # CRS ITM

# extract current velocities to DF
DF$cur_vel <- st_extract(v.star, DF, time_column = "time")

However, this returns the DF with time and geometry from the DF repeated and a cur_vel column with NA's: enter image description here

It seems the function requires an exact match?

To circumvent this I used data.table to match the times from DF to the nearest time in the raster

# Drop geom of DF and make a data.table
DF <- st_transform(DF, 4326)
DF <- cbind(as.data.table(st_drop_geometry(DF)),
                      st_coordinates(DF))

# save all the times from the spatrast in a data.table
time <- time(v)
DT = data.table(
  ID = 1:364,
  time = time
)

# Extract the nearest rast times (DT/v) to the times in the DF
DF.rast.time <- DF[, c("RASTtime") :=
                     DT[DF, on = c("time"), roll = 'nearest', .(x.time)]][]

This produces the following data frame enter image description here

And I then extracted the current velocity values from the raster but using the new RASTtimes in DF:

# save DF.rast.time as an sf object again
DF.rast.time<- st_as_sf(DF.rast.time, coords = c("X", "Y")) 
DF.rast.time = st_set_crs(DF.rast.time, "EPSG:4326") 

# extract current velocities to DF
DF.rast.time$cur_vel <- st_extract(v.star, DF.rast.time, time_column = "RASTtime") %>%
  st_as_sf()

This produced the result I was looking for: enter image description here

However, is it possible to stop the replicate columns of RASTtime and geometry from appearing in this new dataframe?

Also, is there a more straightforward way to find nearest matches in time between a dataframe and a raster?


Solution

  • Example data

    library(terra)
    datetime <- as.POSIXct("2020-01-01 00:00:00") + (0:363) * 3600
    r <- rast(ncol=37, nrow=74, nlyr=364, extent=ext(-6.3875, -5.4875, 52.2375, 54.0625), time=datetime)
    set.seed(0)
    r <- init(r, runif)
     
    time <- as.POSIXct(c("2020-01-01 15:21:00", "2020-01-14 08:01:00", "2020-01-15 21:19:00"))
    lat <- c(53.92265, 52.86578, 53.38290)
    long <-c(-6.026689, -5.819075, -6.028343)
    sighting <- c("1",  "0", "1")
    df <- data.frame(sighting, lat, long, time)
    

    The SpatRaster has time in steps of hours. So I round the time in the data.frame to the nearest hour, and then use match to find the layer of interest, and use that in extract.

    m <-  match(round(df$time, units="hours"), time(r))
    e <- extract(r, df[, c("long", "lat")], layer=m)
    e
    #  ID   layer     value
    #1  1  lyr.16 0.4852105
    #2  2 lyr.321 0.5526752
    #3  3 lyr.358 0.8640279
    

    Or like this, in two steps:

    ee <- extract(r, df[, c("long", "lat")], ID=FALSE)
    ee[cbind(1:length(m), m)]
    # [1] 0.4852105 0.5526752 0.8640279
    

    I think that match ignores differences in time zones. Time zones are considered if you do

    m <- sapply(df$time, \(i) which.min(abs(time(r)-i)))
    

    That is nice as it avoids the rounding and is more general. But then you must make sure that they are correctly set for the SpatRaster and the data.frame.

    For example

    datetime <- as.POSIXct("2020-01-01 00:00:00", tz="America/Los_Angeles") + (0:363) * 3600
    datetime[1]
    # [1] "2020-01-01 PST"
    

    See Sys.timezone() for your time zone

    Finally, if you need to transform the long/lat data to the crs of the SpatRaster you can do

    v <- vect(df, c("long", "lat"), crs="+proj=longlat")
    pv <- project(v, r)
    e <- extract(r, pv, layer=m)