rrasterterrarasterize

rasterize and mask in one step?


Toy data

library(terra)

template_raster <- rast(xmin=0, xmax=10, ymin=0, ymax=10, crs="+proj=longlat +datum=WGS84 +no_defs +type=crs", resolution = 0.1)
values(template_raster) <- c(rep(NA,4000),rep(10,2000),rep(NA,4000))

polygon_to_rasterize <- as.polygons(ext(3, 6, 3, 6), crs="+proj=longlat +datum=WGS84")
polygon_to_rasterize$polygon_val <- 20

plot(template_raster)
plot(polygon_to_rasterize, add=T)

enter image description here

Problem

I want to rasterize my polygon, but only in the area where there is data in my template raster, also I want my resultant raster to have the same dimensions as the template. I can do it in two steps here I think.

rasterized_polygon <- rasterize(polygon_to_rasterize, template_raster, filename = "temp.tif")
rasterized_polygon_only_where_template_data <- mask(rasterized_polygon, template_raster, filename = "temp2.tif")

plot(rasterized_polygon_only_where_template_data)

The result looks correct. But as I have to do this for over 50,000 layers, I am hoping it can be done in one line, rather than writing an intermediate file (temp.tif). Any ideas please?


Solution

  • Your example data (with more agreeable variable names)

    library(terra)
    
    r <- rast(xmin=0, xmax=10, ymin=0, ymax=10, crs="+proj=longlat +datum=WGS84 +no_defs +type=crs", resolution = 0.1)
    values(r) <- c(rep(NA,4000),rep(10,2000),rep(NA,4000))
    p <- as.polygons(ext(3, 6, 3, 6), crs="+proj=longlat +datum=WGS84")
    p$val <- 20
    

    Here are three approaches.

    a)

    x <- rasterize(p, r) |>  mask(r)
    

    b)

    y <- rasterize(p, r, "val") |> mask(r)
    

    c)

    z <- mask(r, p)
    

    (a) is your two-step approach. The values of x are 1. (b) is a slightly modified approach to transfer the values of p to the raster (the values of y are 20) (c) is in one step, and the result is similar in that it gets you the same area (cells with a value; and this is what Allan Cameron suggested). The remaining values of z are what they were in r (10).

    It is not entirely clear if the output cell values matter to you; or what you would like them to be.

    I understand it could be nice to mask and change the values in one step, but it may be better to have two methods with clear arguments that are easy to understand?

    If you are working with many rasters with the same extent and resolution, but only one polygon, then it would be much more efficient to rasterize the polygons once and use that raster for masking.

    Of course, you are not required to write the intermediate rasters to files, but I assume that you are working with large datasets and these will be written automatically if you do not do so yourself.