r, data.table, raster, terra

Merge tabular data with raster based on key value in raster cell ("left join")


I'd like to join tabular data to a raster using the current cell values as a key. Is there a way to do this with large rasters (100M to 1B cells)? Maybe there's something obvious in terra, but nothing is coming to mind. See my current solution below:

# Load Libs
library(data.table)
library(terra)

# -------- example data ------------

# Create Raster 
r <- rast(matrix(c(1:24), nrow = 4, ncol = 6))
names(r) <- "r_key"

# Create tabular data
dt <- data.table(
  r_key = c(1:50), 
  vals = sample(c(1000:5000), 50)
)

# -------- end example data ----------

# -------- Current Solution ----------

# Clunky Solution: This has memory limitations as ncell() in raster grows. 
 
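# Build the table directly from the cell values
# (:= cannot add a column to a null data.table)
r_table <- data.table(r_key = as.vector(values(r)))
# Add an index to preserve the raster cell order
r_table[, index := .I]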

# Conventional Merge
r_table <- merge(r_table,
                 dt,
                 by = "r_key",
                 all.x = TRUE)

# Ensure order
setorder(r_table, "index")

# Raster with new vals
r_new <- deepcopy(r)
values(r_new) <- r_table[["vals"]]

plot(r_new)

# ---------- end current solution ------

One variation on the above was to skip merge() and use values(r_new) <- dt[r_table, , on = "r_key"][["vals"]], but I'm not sure if this puts me any further ahead as far as memory goes.

Maybe there's an obvious way to do a lookup on a vector or something clever with data.table? Or does something exist in terra?


Solution

  • You can do this:

    x <- classify(r, dt)
    

    or this (I do not know how well it would scale):

    levels(r) <- dt
    y <- as.numeric(r, 1)
    

    It is difficult to give a more complete answer without understanding the context a bit better. That is, why do you need to do this in the first place?
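
To make the classify() suggestion above concrete, here is a minimal sketch, assuming the lookup table can be passed as a two-column numeric matrix ("is", "becomes"). The fixed `vals` (instead of `sample()`), the plain data.frame standing in for the data.table, and the `others = NA` choice for unmatched keys are assumptions made for illustration. They are not part of the original answer.

```r
library(terra)

# Example data mirroring the question; vals are fixed so the result is reproducible
r <- rast(matrix(1:24, nrow = 4, ncol = 6))
names(r) <- "r_key"
dt <- data.frame(r_key = 1:50, vals = 1001:1050)

# Two-column "is -> becomes" matrix: column 1 is the key, column 2 the new value
rcl <- cbind(as.numeric(dt$r_key), as.numeric(dt$vals))

# others = NA sets cells whose key is absent from the table to NA,
# which mimics the all.x = TRUE behaviour of the merge() in the question
x <- classify(r, rcl, others = NA)

# Cross-check against a plain vector lookup with match() on the cell values
expected <- dt$vals[match(values(r)[, 1], dt$r_key)]
stopifnot(isTRUE(all.equal(as.numeric(values(x)[, 1]), as.numeric(expected))))
```

For rasters too large to hold in memory, classify() also takes a `filename` argument so the result is written to disk. Whether that is fast enough at 1B cells would need testing on the real data.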