I have a spatrast with information on current velocities which I calculated from u and v components.
How can I extract points from this SPATRAST to a dataframe based on time and spatial matches?
code for recreating example data:
library(stars)
library(lubridate)
library(terra)
# create a datetime vector holding 364 entries at one hour intervals####
datetime <- seq(as.POSIXct("2020-01-01 00:00:00"), as.POSIXct("2020-01-17 00:00:00"), by="hour")
datetime <- datetime[1:364]
# Create a sample rast ####
v <- rast(ncol=37, nrow=74, nlyr=364, extent=ext(-6.3875, -5.4875, 52.2375, 54.0625),
crs="+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs+ towgs84=0,0,0", time=datetime)
v <- init(v, runif)
# Create a sample df
time <- as.POSIXct(c("2020-01-01 15:21:00", "2020-01-14 08:01:00", "2020-01-16 21:19:00"))
lat <- c("53.92265", "52.86578", "53.38290")
long <-c("-6.026689", "-5.819075", "-6.028343")
sighting <- c("1", "0", "1")
df <- data.frame(sighting, lat, long, time)
DF<- st_as_sf(df, coords = c("long", "lat"))
DF = st_set_crs(DF, "EPSG:4326")
#DF = st_transform(DF, 2157) # CRS ITM
I tired to use st_extract from stars:
# save spatrast as a stars object
v.star <-st_as_stars(v)
#v.star = st_transform(v.star, 2157) # CRS ITM
# extract current velocities to DF
DF$cur_vel <- st_extract(v.star, DF, time_column = "time")
However, this returns the DF with time and geometry from the DF repeated and a cur_vel column with NA's:
It seems the function requires an exact match?
To circumvent this I used data.table to match the times from DF to the nearest time in the raster
# Drop geom of DF and make a data.table
DF <- st_transform(DF, 4326)
DF <- cbind(as.data.table(st_drop_geometry(DF)),
st_coordinates(DF))
# save all the times from the spatrast in a data.table
time <- time(v)
DT = data.table(
ID = 1:364,
time = time
)
# Extract the nearest rast times (DT/v) to the times in the DF
DF.rast.time <- DF[, c("RASTtime") :=
DT[DF, on = c("time"), roll = 'nearest', .(x.time)]][]
This produces the following data frame
And I then extracted the current velocity values from the raster but using the new RASTtimes in DF:
# save DF.rast.time as an sf object again
DF.rast.time<- st_as_sf(DF.rast.time, coords = c("X", "Y"))
DF.rast.time = st_set_crs(DF.rast.time, "EPSG:4326")
# extract current velocities to DF
DF.rast.time$cur_vel <- st_extract(v.star, DF.rast.time, time_column = "RASTtime") %>%
st_as_sf()
This produced the result I was looking for:
However, is it possible to stop the replicate columns of RASTtime and geometry from appearing in this new dataframe?
Also, is there a more straightforward way to find nearest matches in time between a dataframe and a raster?
Example data
library(terra)
datetime <- as.POSIXct("2020-01-01 00:00:00") + (0:363) * 3600
r <- rast(ncol=37, nrow=74, nlyr=364, extent=ext(-6.3875, -5.4875, 52.2375, 54.0625), time=datetime)
set.seed(0)
r <- init(r, runif)
time <- as.POSIXct(c("2020-01-01 15:21:00", "2020-01-14 08:01:00", "2020-01-15 21:19:00"))
lat <- c(53.92265, 52.86578, 53.38290)
long <-c(-6.026689, -5.819075, -6.028343)
sighting <- c("1", "0", "1")
df <- data.frame(sighting, lat, long, time)
The SpatRaster has time in steps of hours. So I round the time in the data.frame to the nearest hour, and then use match
to find the layer of interest, and use that in extract
.
m <- match(round(df$time, units="hours"), time(r))
e <- extract(r, df[, c("long", "lat")], layer=m)
e
# ID layer value
#1 1 lyr.16 0.4852105
#2 2 lyr.321 0.5526752
#3 3 lyr.358 0.8640279
Or like this, in two steps:
ee <- extract(r, df[, c("long", "lat")], ID=FALSE)
ee[cbind(1:length(m), m)]
# [1] 0.4852105 0.5526752 0.8640279
I think that match
ignores differences in time zones. Time zones are considered if you do
m <- sapply(df$time, \(i) which.min(abs(time(r)-i)))
That is nice as it avoids the rounding and is more general. But then you must make sure that they are correctly set for the SpatRaster and the data.frame.
For example
datetime <- as.POSIXct("2020-01-01 00:00:00", tz="America/Los_Angeles") + (0:363) * 3600
datetime[1]
# [1] "2020-01-01 PST"
See Sys.timezone()
for your time zone
Finally, if you need to transform the long/lat data to the crs of the SpatRaster you can do
v <- vect(df, c("long", "lat"), crs="+proj=longlat")
pv <- project(v, r)
e <- extract(r, pv, layer=m)