rterrarasterize

Rasterize points using the terra package to produce multi-layer SpatRaster


I have some long-standing package code that uses raster::rasterize that I'm trying to update to terra::rasterize. The code takes point data, where each point has one of two possible integer ID values. The output is a raster with two layers, one for each possible point ID, where cell values are counts. The relevant bits are:

  # r0 is template raster to define extent and resolution
  r <- raster::rasterize(dat[, c("X", "Y")],
                         r0,
                         field = dat$flightlineID,
                         fun = f,
                         background = 0)

Here, f is a function that takes a vector of point IDs and returns a two-element vector of counts, which results in the desired two layer output raster.

My first attempt to port this to terra::rasterize (package version 1.6-17) was...

r <- terra::rasterize(cbind(dat$X, dat$Y), # seem to need a matrix rather than a data frame
                      r0,  # template SpatRaster
                      values = dat$flightlineID, 
                      fun = f, 
                      background = 0)

This fails with the error:

Error in w[vv[, 1], ] <- vv[, -1] : number of items to replace is not a multiple of replacement length

Delving into the code for terra:::rasterize_points it seems that the number of layers for the output raster is determined by treating the 'values' argument as a data frame and checking the number of columns. This is a bit confusing because the package docs state that the values argument is expected to be a numeric vector, of either length 1 or nrow(x) where x is the input point data. Moreover, the length of the vector returned by the user-supplied summary function doesn't seem to play any part in determining the number of output raster layers.

For the moment I've simply retained the old raster::rasterize code and convert the output raster to a SpatRaster, but I think I must be missing something obvious. Is there a way of using just terra::rasterize to accomplish this task?

EDIT: As requested in comments, here is a small sample of the input point data to show the format. Typical input data sizes range from 2 to 40 million points.

structure(list(X = c(420094, 420067, 420017, 420050, 420058, 
420090, 420038, 420040, 420081, 420097, 420075, 420041, 420039, 
420062, 420050, 420083, 420019, 420019, 420044, 420087, 420099, 
420077, 420030, 420014, 420015, 420051, 420033, 420056, 420041, 
420030, 420027, 420024, 420058, 420042, 420063, 420028, 420073, 
420053, 420010, 420100, 420048, 420062, 420056, 420080, 420053, 
420068, 420074, 420004, 420010, 420078), Y = c(6676049, 6676029, 
6676034, 6676019, 6676096, 6676010, 6676003, 6676048, 6676073, 
6676023, 6676089, 6676082, 6676010, 6676051, 6676039, 6676099, 
6676024, 6676073, 6676040, 6676056, 6676072, 6676086, 6676030, 
6676042, 6676002, 6676033, 6676078, 6676073, 6676013, 6676056, 
6676055, 6676069, 6676072, 6676089, 6676069, 6676058, 6676023, 
6676039, 6676043, 6676017, 6676011, 6676054, 6676095, 6676068, 
6676098, 6676077, 6676049, 6676073, 6676097, 6676057), flightlineID = c(2L, 
1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 
2L)), row.names = c(NA, -50L), class = "data.frame")

EDIT: In the raster package code, the private .pointsToRaster function has a line (see here) where the length of the output from the user-supplied summary function is checked with some arbitrary test values to determine the number of layers in the output raster. This seems to be absent from the terra package code.


Solution

  • It may be that you don't want this as two layers in one raster, though this is hard to tell with the supplied data as it appears to be all 'within' the overlap. I notice in you package, there is an attempt to throttle/reduce tile edge points that maybe just needs to be set lower than 1K.

    That terra doesn't work the same as raster when rasterize(ing may be a decision that under terra one should intend two layers via making each then add<-ing or <- c(ing, whereas with raster it was assumed via a hard to follow logic of 'field' and 'values'. Using your above data (and keeping two rasters):

    library(terra) 
    #las_df <- structure(...)
    las_df1 <- las_df[which(las_df$flightlineID == 1L), ]
    las_df2 <- las_df[which(las_df$flightlineID == 2L), ]
    las_vect1 <- vect(las_df1, geom = c('X', 'Y'), crs = 'EPSG:32755')
    las_vect2 <- vect(las_df2, geom = c('X', 'Y'), crs = 'EPSG:32755')
    las_rast <- rast(xmin=0, nrow = length(unique(las_df$X)), ncol = length(unique(las_df$Y)), crs='EPSG:32755')
    set.ext(las_rast, c(min(las_df$X), max(las_df$X), min(las_df$Y), max(las_df$Y)))
    pts1_rast <- rasterize(las_vect1, las_rast, fun = length)
    pts2_rast <- rasterize(las_vect2, las_rast, fun = length)
    pts1_pts2_rast <- c(pts1_rast, pts2_rast)
    names(pts1_pts2_rast) <- c('lyr.1', 'lyr.2') # have to attend to this as both lyr.1 after `c(`
    plot(pts1_pts2_rast$lyr.1, col = 'red')
    plot(pts1_pts2_rast$lyr.2, col = 'blue', alpha=.75, add = TRUE)
    # there is 1 cell that contains points from both pts1_rast and pts2_rast
     cells(pts1_rast) %in% cells(pts2_rast)
     [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
     cells(pts2_rast) %in% cells(pts1_rast)
     [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    [25] FALSE FALSE FALSE
    

    One might suggest a consistent merge policy where pts1 or pts2 are always favored. In the end, if this is about optimizing allocation of scarce resources, clear bush where you have the best data, inspect, and clear again. But it still seems best to resolve this at the las level upstream.