rforeachparallel-processingrasterunmarked-package

Matching rasters do not work in foreach loop


I'm working on a habitat occupancy prediction encompassing the entire state of Wyoming. Certain site covariate rasters work in the prediction while others with matched resolution, extent, etc. do not.

A short reproduceable example of my code is below. After extensive troubleshooting I've found I have 3 rasters of the 5 I need to use that cause this script to fail, all with the same error. I'm assuming my rasters have somehow become corrupted(?) but wanted to see if anyone has another idea on what could be happening.

Data is at this link. The data is the unmarked object (saved as .rds) and 2 very small clips off of: 1. the raster that works, and 2. one of the rasters that does not work

Steps I took to originally align the rasters for stacking - for information purposes

#ndvi <- raster(paste(getwd(), "./Original_rasters/ndvi_summer.TIF", sep = ""))

#precip <- raster(paste(getwd(), "./Original_rasters/bioclim15.TIF", sep = ""))

#temp <- projectExtent(original, original)
#res(temp) <- 220

#sNoJoy <- resample(ndvi, temp)
#sampleJoy <- resample(precip, temp)

Start here for reproducing error

library(raster)
library(unmarked)
sampleData <- readRDS("./sampleData.RDS")

# Formula referencing raster that does not work
fmTest <- occu(~ Day + Min_TempC + AvailTN_prop ~ ndvi.summer, 
               sampleData)
fmTest

# Formula referencing raster that DOES work
#fmTest <- occu(~ Day + Min_TempC + AvailTN_prop ~ bc15_220, 
#               sampleData)

# Formula for both variables that fails also
#fmTest <- occu(~ Day + Min_TempC + AvailTN_prop ~ ndvi.summer + bc15_220,
#               sampleData)

# Load and name rasters so formulas can find them
sampleJoy <- raster(paste(getwd(), "./sampleGoodRas.TIF", sep = ""))
names(sampleJoy) <- "bc15_220"
sNoJoy <- raster(paste(getwd(), "./sampleBadRas.TIF", sep = ""))
names(sNoJoy) <- "ndvi.summer"

compareRaster(sampleJoy, sNoJoy) # Returns "[1] TRUE"
 
# pm <- stack(sampleJoy)
pm <- stack(sNoJoy)
# pm <- stack(sNoJoy, sampleJoy)

###########################Combine Function#####################################

comb <- function(x, ...) {
  mapply("rbind", x, ..., SIMPLIFY = F)
}

# This combine function allows foreach to return a list containing multiple 
# matrices making it easy to insert results into raster templates

############################ Foreach Code #####################################
#Assemble cluster for parallel processing.  Code works in Windows or other O/S#

ifelse(Sys.info()["sysname"] != "Windows", 
       c(require(doMC), nc <- detectCores()-1, registerDoMC(nc)),
       c(require(doParallel), nc <- detectCores()-1, cl <- makeCluster(nc), 
         registerDoParallel(cl)))

# Foreach loop returning predicted values, SE, LCI, and UCI
pred <- foreach(i = 1:nrow(pm), .combine = comb, .multicombine = T,
                .maxcombine = 90,
                .packages = c("unmarked", "raster")) %dopar% {
  
  # make raster into a data.frame row by row for prediction
  tmp <- as.data.frame(pm[i,], xy = T)
  
  # Predict the new data
  pred <- predict(fmTest, "state", tmp)
  
  # Make a list of 4 matrices to retrieve them from the loop      
  list(Predicted = pred$Predicted,
       SE = pred$SE,
       lower = pred$lower,
       upper = pred$upper)
}

# Close the cluster
stopCluster(cl)

## Using sampleJoy produces a list of 4 matrices which are easily coerced into
## raster format: Prediction, SE, lower, and upper, as it should.

## Using sNoJoy produces:  
# Error in { : 
# task 1 failed - "Matrices must have same number of rows in 
# cbind2(.Call(dense_to_Csparse, x), y)" 

## Rasters are the same extent, same origin, same resolution, etc.
# 

### Not Working
# > pm
# class      : RasterStack 
# dimensions : 15, 2675, 40125, 1  (nrow, ncol, ncell, nlayers)
# resolution : 220, 220  (x, y)
# extent     : 201539.7, 790039.7, 647050.2, 650350.2  (xmin, xmax, ymin, ymax)
# crs        : +proj=lcc +lat_0=41 +lon_0=-107.5 +lat_1=41 +lat_2=45 +x_0=500000 +y_0=200000 +datum=NAD83 +units=m +no_defs 
# names      : ndvi.summer 
# min values :  0.09507491 
# max values :   0.8002191 
# 

### Working Raster
# > pm
# class      : RasterStack 
# dimensions : 15, 2675, 40125, 1  (nrow, ncol, ncell, nlayers)
# resolution : 220, 220  (x, y)
# extent     : 201539.7, 790039.7, 647050.2, 650350.2  (xmin, xmax, ymin, ymax)
# crs        : +proj=lcc +lat_0=41 +lon_0=-107.5 +lat_1=41 +lat_2=45 +x_0=500000 +y_0=200000 +datum=NAD83 +units=m +no_defs 
# names      : bc15_220 
# min values :       14 
# max values :       66

Solution

  • Answer

    The error arises because you have missings in sNoJoy. Had those not been missing, it would have worked just fine.

    Question rewritten

    Your problem has nothing to do with your parallel code. It boils down to this:

    fmTest <- occu(~ Day + Min_TempC + AvailTN_prop ~ ndvi.summer, sampleData)
    fmTest2 <- occu(~ Day + Min_TempC + AvailTN_prop ~ bc15_220, sampleData)
    
    pm <- stack(sNoJoy)
    pm2 <- stack(sampleJoy)
    
    tmp <- as.data.frame(pm[1,], xy = T)
    tmp2 <- as.data.frame(pm2[1,], xy = T)
    
    pred <- predict(fmTest, "state", tmp) # fails
    pred2 <- predict(fmTest2, "state", tmp2) # works
    

    Rationale

    As it turns out, your bad raster has missing values:

    table(is.na(sNoJoy[]))
    #FALSE  TRUE 
    #35998  4127 
    

    If we artificially remove the NAs in sNoJoy and randomly write 1 NA in sampleJoy, then the states flip:

    sNoJoy[is.na(sNoJoy[])] <- 1
    sampleJoy[10] <- NA
    
    ### run same code as above
    
    pred <- predict(fmTest, "state", tmp) # now works
    pred2 <- predict(fmTest2, "state", tmp2) # now fails
    

    Thus, I would try to figure out why you have NAs to begin with.