Using the terra package in R, I've got a raster and an associated attribute table that links the raster values to differently partitioned categories.
# generate a raster with values
set.seed(0)
raster_size <- 20
example_raster <- terra::rast(nrow=raster_size, ncols=raster_size)
example_raster_values <- sample(8, ncell(example_raster), replace=TRUE) # replace = TRUE to reuse values
values(example_raster) <- example_raster_values
# generate an attribute table with different categories and sub categories
example_attribute_table <-
data.frame(id = 1:8,
category1 = c("Haplic Acrisols", "Haplic Acrisols (Alumic)", "Haplic Acrisols (Ferric)", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
category2 = c("Haplic Acrisols", "Haplic Acrisols", "Haplic Acrisols", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
category3 = c("Acrisols", "Acrisols", "Acrisols", "Acrisols", "Acrisols", "Albeluvisols", "Albeluvisols", "Albeluvisols")
)
# substitute raster values
raster_cat1 <- terra::subst(example_raster, example_attribute_table$id, example_attribute_table$category1)
raster_cat2 <- terra::subst(example_raster, example_attribute_table$id, example_attribute_table$category2)
raster_cat3 <- terra::subst(example_raster, example_attribute_table$id, example_attribute_table$category3)
# stack rasters and change names
raster_stack <- c(example_raster, raster_cat1, raster_cat2, raster_cat3)
names(raster_stack) <- c("id", "category1", "category2", "category3")
I want to be able to plot (and post-process) these categories.
plot(raster_stack)
Note that id
and category1
are isomorphic (only renaming); category2
and category3
merge some of the ids
While this works, I find it surprisingly slow for large rasters (e. g. takes ~20 minutes for a 400 MB raster), considering it's conceptually no more than a renaming/merge of a few attributes.
I'm wondering if it's possible to make this much faster by using a different file format like geopackage, or by using categorical rasters. Maybe something that doesn't change the raster pixel by pixel, like terra::subst()
and terra::reclassify()
seem to do.
I was able to make this effectively "instantaneous" for the renaming case, i. e. when I plot this, the legend contains the labels of category1
(rather than those of id
).
levels(example_raster) <- example_attribute_table
plot(example_raster)
Are there any other fast options for renaming/merging categories?
There is indeed no need to reclassify the raster cell values. You can set the categories table to the SpatRaster and then change the "active category" in a for-loop.
Your example data
library(terra)
set.seed(0)
r <- terra::rast(nrow=20, ncols=20)
values(r) <- sample(8, ncell(r), replace=TRUE)
att_table <- data.frame(
id = 1:8,
category1 = c("Haplic Acrisols", "Haplic Acrisols (Alumic)", "Haplic Acrisols (Ferric)", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
category2 = c("Haplic Acrisols", "Haplic Acrisols", "Haplic Acrisols", "Plinthic Acrisols", "Vetic Acrisols", "Haplic Albeluvisols", "Histic Albeluvisols", "Umbric Albeluvisols"),
category3 = c("Acrisols", "Acrisols", "Acrisols", "Acrisols", "Acrisols", "Albeluvisols", "Albeluvisols", "Albeluvisols")
)
Solution
levels(r) <- att_table
par(mfrow=c(2,2))
for (i in 1:4) {
activeCat(r) <- i-1
plot(r, mar=c(1,1,0,10))
}
You can also do
levels(r) <- att_table
x <- rast(lapply(1:4, \(i) {activeCat(r) <- i-1; r}))
plot(x)
# extract values
x[9:10]
# id category1 category2 category3
#3 3 Haplic Acrisols (Ferric) Haplic Acrisols Acrisols
#6 6 Haplic Albeluvisols Haplic Albeluvisols Albeluvisols
But the most efficient way to extract all the attributes would be
e <- extract(r, 9:10, raw=TRUE)
# if the attributes are not ordered 1:n
# e <- match(e, att_table$id)
att_table[e,]
# id category1 category2 category3
#3 3 Haplic Acrisols (Ferric) Haplic Acrisols Acrisols
#6 6 Haplic Albeluvisols Haplic Albeluvisols Albeluvisols