I am attempting to calculate the normalized difference vegetation index (NDVI) for a large directory of National Aerial Imagery Program (NAIP) imagery. I want to run it in parallel to speed up the calculation but I need to calculate it in a memory safe way. Reading other StackOverflow questions related to this topic, here are the morsels of wisdom I took away:
foreach()
allows for looping in parallellapp()
is the memory safe way to do raster algebra on multiple bandswrap()
to be run in parallelIn the function to calculate NDVI I have tried two approaches:
First, I tried reading in the raster and passing it to wrap()
, but it just produces a NULL value, but doesn't throw an error.
Second I removed the wrap()
and got this error:
Error in x@ptr$nrow() : external pointer is not valid
Here is what I have attempted so far.
##Loading Necessary Packages##
library(terra)
library(doParallel)
library(foreach)
##Creating Fake NAIP tiles##
inpath<-file.path(tempdir(), "NAIP")
if(!dir.exists(inpath)){
dir.create(inpath)
}
NAIP_1<-rast(xmin=45.5,xmax=45.55, ymin=-122.5, ymax=-122.45, nrows=50, ncols=50, crs="EPSG:4326", nlyrs=4)
NAIP_2<-rast(xmin=45.5,xmax=45.55, ymin=-122.45, ymax=-122.4, nrows=50, ncols=50, crs="EPSG:4326", nlyrs=4)
values(NAIP_1)<-sample(c(1:255), 10000, replace=TRUE)
values(NAIP_2)<-sample(c(1:255), 10000, replace=TRUE)
writeRaster(NAIP_1, filename = file.path(inpath, "FAKE_NAIP1.tif"))
writeRaster(NAIP_2, filename = file.path(inpath, "FAKE_NAIP2.tif"))
tmp<-list.files(inpath, pattern="*.tif$", full.names = TRUE)
##Creating output directory##
outpath<-file.path(tempdir(), "NDVI")
if(!dir.exists(outpath)){
dir.create(outpath)
}
##Creating the NDVI function##
NDVI_calc<- function(X, outpath) {
r<-terra::rast(X)
#r<-terra::wrap(terra::rast(X)) #the other options I tried
f<-basename(X)
o<-paste(outpath, f, sep="/")
if(!file.exists(o)){
NDVI_int<-function(n,r){
ndvi<-as.integer(10000*(n-r)/ (n+r))
return(ndvi)
}
terra::lapp(r[[c(4,3)]], fun=NDVI_int, filename=o, wopt=list(datatype="INT4S"), overwrite=TRUE)
}
}
##Setting up clusters for parallel processing##
ncores<-detectCores()-2
cl<-makeCluster(ncores)
registerDoParallel(cl)
##Running the function in parallel##
foreach(i = tmp, outpath=outpath) %dopar% {
NDVI_calc(i, outpath=outpath)
}
##Close the cluster connections
stopCluster(cl)
You can make it work like this:
NDVI_calc <- function(X, outpath) {
f <- basename(X)
out <- paste(outpath, f, sep="/")
if (!file.exists(out)){
r <- terra::rast(X)
NDVI_int <- function(n, r){
ndvi <- as.integer(10000*(n-r) / (n+r))
return(ndvi)
}
terra::lapp(r[[c(4,3)]], fun=NDVI_int, filename=out, wopt=list(datatype="INT4S"), overwrite=TRUE)
}
return(out)
}
ncores<-detectCores()-2
cl <-makeCluster(ncores)
registerDoParallel(cl)
out <- foreach(i = tmp) %dopar% {
NDVI_calc(i, outpath=outpath)
}
stopCluster(cl)
str(out)
#List of 2
# $ : chr "Rtmpw9tcyb/NDVI/FAKE_NAIP1.tif"
# $ : chr "Rtmpw9tcyb/NDVI/FAKE_NAIP2.tif"
That is, do not pass "outpath" to foreach
and return the filenames, not the SpatRasters.