r, parallel-processing

Error in parallel processing in R when using the atakrig package


I am using the atakrig package. When I set the number of cores to 8 for parallel computation, I get this error: 0%Error in serialize(data, node$con) : error writing to connection. If I set the number of cores to 4, the parallel processing works without errors. The error occurs when I use the package's own function ataStartCluster().

I found a workaround for this problem: using cl <- parallel::makeCluster(8, type = "PSOCK") instead of ataStartCluster(). I also compared the execution times with parallel::makeCluster() and without any parallel code, to see whether the parallel computation works as intended, and it appears to.

With cl <- parallel::makeCluster(8, type = "PSOCK") the execution time is 1.5 minutes, whereas without the parallel code it is 3.2 minutes. Note that I don't get the error on my other laptop, which has 4 GB of RAM and Windows 10 but the same R version and packages as this laptop (please see the session info below).
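A timing comparison like the one described above can be sketched with base R only. This is a minimal illustration, not the atakrig workflow: slow_square is a hypothetical stand-in workload, and the worker count of 2 is an assumption chosen to keep the example light.

```r
library(parallel)

# Hypothetical stand-in workload (not part of atakrig): sleeps briefly, then squares.
slow_square <- function(x) {
  Sys.sleep(0.05)
  x^2
}

inputs <- 1:16
n_workers <- 2L  # assumed worker count; adjust to your machine

# Serial baseline
t_serial <- system.time(
  res_serial <- lapply(inputs, slow_square)
)[["elapsed"]]

# Same work distributed over a PSOCK cluster
cl <- makeCluster(n_workers, type = "PSOCK")
t_parallel <- system.time(
  res_parallel <- parLapply(cl, inputs, slow_square)
)[["elapsed"]]
stopCluster(cl)

cat("serial elapsed:  ", t_serial, "s\n")
cat("parallel elapsed:", t_parallel, "s\n")
```

Comparing the elapsed entries (rather than user or system time) is what matters here, because the work happens in the worker processes and is not counted in the master's CPU time.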

The code I am using:

library(atakrig)
library(terra)
rpath <- system.file("extdata", package="atakrig")
aod3k <- rast(file.path(rpath, "MOD04_3K_A2017042.tif"))
aod10 <- rast(file.path(rpath, "MOD04_L2_A2017042.tif"))
aod3k.d <- discretizeRaster(aod3k, 1500)
aod10.d <- discretizeRaster(aod10, 1500)
grid.pred <- discretizeRaster(aod3k, 1500, type = "all")
aod3k.d$areaValues$value <- log(aod3k.d$areaValues$value)
aod10.d$areaValues$value <- log(aod10.d$areaValues$value)
## area-to-area Kriging ---
# point-scale variogram from combined AOD-3k and AOD-10
aod.combine <- rbindDiscreteArea(x = aod3k.d, y = aod10.d)
vgm.ok_combine <- deconvPointVgm(aod.combine, model="Exp", ngroup=12, rd=0.75)
# point-scale cross-variogram
aod.list <- list(aod3k=aod3k.d, aod10=aod10.d)
vgm.ck <- deconvPointVgmForCoKriging(aod.list, model="Exp", ngroup=12, rd=0.75,fixed.range = 9e4)
# prediction
cl <- parallel::makeCluster(8, type = "PSOCK")
pred.ataok <- ataKriging(aod10.d, grid.pred, vgm.ck$aod10, showProgress = TRUE, nopar = FALSE)
parallel::stopCluster(cl)

Session info:

R version 4.4.3 (2025-02-28 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    


attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] terra_1.8-42    atakrig_0.9.8.1

loaded via a namespace (and not attached):
 [1] DBI_1.2.3          KernSmooth_2.23-26 sf_1.0-20          doSNOW_1.0.20      zoo_1.8-13         spacetime_1.3-3    xts_0.14.1        
 [8] e1071_1.7-16       snow_0.4-4         sp_2.2-0           gstat_2.1-3        grid_4.4.3         classInt_0.4-11    foreach_1.5.2     
[15] FNN_1.1.4.1        intervals_0.15.5   compiler_4.4.3     codetools_0.2-20   Rcpp_1.0.14        rstudioapi_0.17.1  lattice_0.22-7    
[22] class_7.3-23       parallel_4.4.3     magrittr_2.0.3     tools_4.4.3        proxy_0.4-27       iterators_1.0.14   units_0.8-7 

EDIT

After loading the foreach package (i.e., library(foreach)), the code ran successfully. I don't know why or how that helped.


Solution

  • I'm not sure how ataKriging would use the cluster if you only call cl <- parallel::makeCluster(8, type = "PSOCK") without registering it via registerDoSNOW and setting the option that ataKriging expects. Moreover, your call sets the nopar argument explicitly, which is worth double-checking. Maybe try parallel::makePSOCKcluster, which is "an enhanced version of snow::makeSOCKcluster".

    > ncpu <- parallel::detectCores() - 1L
    > 
    > cl <- parallel::makePSOCKcluster(ncpu)  ## make PSOCK cluster
    > doSNOW::registerDoSNOW(cl)  ## register cluster
    > options(ataKrigCluster=cl)  ## set options for atakrig
    > 
    > system.time(pred.ataok <- ataKriging(aod10.d, grid.pred, vgm.ck$aod10, showProgress=TRUE))
      |==================================================| 100%
       user  system elapsed 
      2.069   0.186  10.821 
    > 
    > parallel::stopCluster(cl)  ## stop cluster
    > 
    > head(pred.ataok)
      areaId    centx   centy     pred       var
    1      1 857313.2 4487148 3.246404 0.4251597
    2      2 860313.2 4487148 3.254721 0.4046833
    3      3 863313.2 4487148 3.238521 0.3841170
    4      4 866313.2 4487148 3.221157 0.3632869
    5      5 869313.2 4487148 3.202737 0.3421746
    6      6 872313.2 4487148 3.183429 0.3208301
    

    ataStartCluster also works for me, though I'm on Linux.

    > ataStartCluster(spec=ncpu)
    socket cluster with 15 nodes on host ‘localhost’
    > system.time(pred.ataok <- ataKriging(aod10.d, grid.pred, vgm.ck$aod10, showProgress=TRUE))
      |==================================================| 100%
       user  system elapsed 
      2.096   0.196  10.845 
    > ataStopCluster()
    

    Might be a problem with an outdated OS on one of your machines. (You might install Linux there, which is free ;)
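Since the error is raised while writing to a worker's connection, one thing that may help narrow it down is checking that every worker starts and can receive data before handing the cluster to atakrig. The following is a rough sketch using only the base parallel package; the worker count and the payload size are assumptions for illustration, and it does not reproduce what ataStartCluster() does internally.

```r
library(parallel)

n_workers <- 2L  # assumed; try the count that fails for you (e.g. 8)
cl <- makeCluster(n_workers, type = "PSOCK")

# Ask every worker for its process ID. If a worker failed to start or has
# died, this call is expected to fail instead of returning a PID.
worker_pids <- unlist(clusterCall(cl, Sys.getpid))

# Send a moderately large object to every worker and have each one report
# its length, which exercises serialization over the worker connections.
payload <- rnorm(1e5)
echo_lengths <- unlist(clusterCall(cl, function(x) length(x), payload))

stopCluster(cl)

print(worker_pids)
print(echo_lengths)
```

If this already fails with 8 workers but succeeds with 4, the problem is probably on the machine or OS side (for example memory or socket limits) rather than in atakrig itself.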