rgisterra

How to speed up getting corresponding cells of one raster that overlap another raster when data are large/high resolution


I have a high resolution raster (highRes), and a lower resolution raster (lowRes), and I would like to return, for every low resolution cell, the values and the percent coverage of the high resolution cells, in each low resolution cell.

I can do this with the following code, and it works. Here, I go cell by cell, converting each cell to a polygon in order to determine which highRes cells are intersected by this polygon. But it is incredibly slow with large rasters. For the project I'm working on, the highRes raster is 1x1km global, and the lowRes raster is 25x25km in resolution.

I'm wondering if there is a faster way to go about this.

library(terra)

highRes <- rast()
res(highRes) <- c(0.1, 0.1)
highRes[] <- sample(1:10, ncell(highRes), replace = TRUE)
lowRes <- rast(res = c(1, 1), xmin = -170, xmax = -50, ymin = 0, ymax = 80)
lowRes[] <- 1
lowRes <- shift(lowRes, dx = 0.02, dy = 0.03) # not aligned



for (i in 1:ncell(lowRes)) {
    
    xx <- rast(lowRes, vals = NA)
    xx[i] <- 1
    p <- as.polygons(xx)
    p <- project(p, crs(highRes))
    e <- extract(highRes, p, weights = TRUE, exact = TRUE)  

}

This gives me the cell values that overlap with the lowRes cell, and the percent coverage of those cells.


Solution

  • Create a single polygon SpatVector and/or sf with all cells represented, then run extract() on a crop()ed version of highRes. You can also use exactextractr::exact_extract(), although it does not work with SpatVector objects and it generates a list object. However, it is much faster than extract().

    library(terra)
    library(sf)
    library(exactextractr)
    
    # Example SpatRaster data
    highRes <- rast(ext(-180, 180, -90, 90), res = 1000/111320, crs = "EPSG:4326")
    set.seed(1)
    highRes [] <- sample(1:10, ncell(highRes), replace = TRUE)
    
    lowRes <- rast(ext(-170, -50, 0, 80), res = 25000/111320, crs = crs(highRes))
    lowRes <- shift(lowRes, dx = 0.02, dy = 0.03)
    
    # Add individual value to each cell
    lowRes[] <- 1:ncell(lowRes)
    
    # Extend lowRes for cropping highRes (to account for shift)
    lowRes_c <- rast(extend(ext(lowRes), c(1,1)), crs = crs(highRes))
    
    # Crop highRes
    highRes_c <- crop(highRes, lowRes_c)
    
    # Convert lowres to SpatVector for terra::extract()
    v_poly <- as.polygons(lowRes)
    
    # Convert lowres to sf for exactextractr::exact_extract()
    sf_poly <- v_poly |>
      st_as_sf()
    
    # Extract values from highRes_c using sf_poly (dataframe)
    st <- Sys.time()
    e <- extract(highRes_c, v_poly, weights = TRUE, exact = TRUE)
    Sys.time() - st
    # Time difference of 1.303376 hours
    
    nrow(e)
    # 128510304
    
    head(e) 
    #   ID lyr.1     weight
    # 1  1     8 0.08010511
    # 2  1     3 0.13965326
    # 3  1     7 0.13965326
    # 4  1     9 0.13965326
    # 5  1     2 0.13965326
    # 6  1     7 0.13965326
    
    # Extract values from highRes_c using sf_poly (list object)
    st <- Sys.time()
    e1 <- exact_extract(highRes_c, sf_poly, weights = "area")
    Sys.time() - st
    # Time difference of 4.259271 mins
    
    head(e1[1][[1]])
    #   value   weight coverage_fraction
    # 1     8 173939.9        0.08007456
    # 2     3 173939.9        0.13959999
    # 3     7 173939.9        0.13959999
    # 4     9 173939.9        0.13959999
    # 5     2 173939.9        0.13959999
    # 6     7 173939.9        0.13959999
    

    Note the values differ between the two methods. The exactextractr devs state that this method is more accurate than other approaches, see the exactextractr GitHub Accuracy section for details.