rterra

Fast renaming and merging of attributes for a raster


Using the terra package in R, I've got a raster and an associated attribute table that links the raster values to differently partitioned categories.

# generate a raster with values
set.seed(0)
raster_size             <- 20
example_raster          <- terra::rast(nrow=raster_size, ncols=raster_size)
example_raster_values   <- sample(8, ncell(example_raster), replace=TRUE)        # replace = TRUE to reuse values
values(example_raster)  <- example_raster_values

# generate an attribute table with different categories and sub categories
example_attribute_table <-
  data.frame(id        = 1:8,
             category1 = c("Haplic Acrisols", "Haplic Acrisols (Alumic)", "Haplic Acrisols (Ferric)", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
             category2 = c("Haplic Acrisols", "Haplic Acrisols", "Haplic Acrisols", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
             category3 = c("Acrisols", "Acrisols", "Acrisols", "Acrisols", "Acrisols", "Albeluvisols", "Albeluvisols", "Albeluvisols")
             )

# substitute raster values
raster_cat1 <- terra::subst(example_raster, example_attribute_table$id, example_attribute_table$category1)
raster_cat2 <- terra::subst(example_raster, example_attribute_table$id, example_attribute_table$category2)
raster_cat3 <- terra::subst(example_raster, example_attribute_table$id, example_attribute_table$category3)

# stack rasters and change names
raster_stack        <- c(example_raster, raster_cat1, raster_cat2, raster_cat3)
names(raster_stack) <- c("id", "category1", "category2", "category3")

I want to be able to plot (and post-process) these categories.

plot(raster_stack)

enter image description here

Note that id and category1 are isomorphic (only renaming); category2 and category3 merge some of the ids

While this works, I find it surprisingly slow for large rasters (e. g. takes ~20 minutes for a 400 MB raster), considering it's conceptually no more than a renaming/merge of a few attributes.

I'm wondering if it's possible to make this much faster by using a different file format like geopackage, or by using categorical rasters. Maybe something that doesn't change the raster pixel by pixel, like terra::subst() and terra::reclassify() seem to do.

I was able to make this effectively "instantaneous" for the renaming case, i. e. when I plot this, the legend contains the labels of category1 (rather than those of id).

levels(example_raster) <- example_attribute_table 
plot(example_raster)

Are there any other fast options for renaming/merging categories?


Solution

  • There is indeed no need to reclassify the raster cell values. You can set the categories table to the SpatRaster and then change the "active category" in a for-loop.

    Your example data

    library(terra)
    set.seed(0)
    r <- terra::rast(nrow=20, ncols=20)
    values(r) <- sample(8, ncell(r), replace=TRUE)
    
    att_table <- data.frame(
        id = 1:8,
        category1 = c("Haplic Acrisols", "Haplic Acrisols (Alumic)", "Haplic Acrisols (Ferric)", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
        category2 = c("Haplic Acrisols", "Haplic Acrisols", "Haplic Acrisols", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
        category3 = c("Acrisols", "Acrisols", "Acrisols", "Acrisols", "Acrisols", "Albeluvisols", "Albeluvisols", "Albeluvisols")
    )
    

    Solution

    levels(r) <- att_table
    
    par(mfrow=c(2,2))
    for (i in 1:4) {
        activeCat(r) <- i-1
        plot(r, mar=c(1,1,0,10))
    }
    

    enter image description here

    You can also do

    levels(r) <- att_table
    x <- rast(lapply(1:4, \(i) {activeCat(r) <- i-1; r}))
    plot(x)
    
    # extract values
    x[9:10]
    #  id                category1           category2    category3
    #3  3 Haplic Acrisols (Ferric)     Haplic Acrisols     Acrisols
    #6  6      Haplic Albeluvisols Haplic Albeluvisols Albeluvisols
    

    But the most efficient way to extract all the attributes would be

    e <- extract(r, 9:10, raw=TRUE)
    # if the attributes are not ordered 1:n
    # e <- match(e, att_table$id)
    att_table[e,]
    #  id                category1           category2    category3
    #3  3 Haplic Acrisols (Ferric)     Haplic Acrisols     Acrisols
    #6  6      Haplic Albeluvisols Haplic Albeluvisols Albeluvisols