Tags: r, parallel-processing, blas, mgcv

How to control (BLAS?) parallelization when using mgcv::gam


I am running some fairly large gam models and don't want to parallelize the computations, or at least want to be able to control the degree of parallelization. (Besides not wanting to fry my machine with long stretches of full CPU utilization, I also suspect that over-parallelization is actually hurting performance.)

I am using the default gam.control() settings of nthreads=1, ncv.threads=1. Nevertheless, the computation seems to be using most or all of the 16 cores on my machine: CPU usage as measured by top goes as high as 1590%.

I assume this is happening because computations via BLAS are being automatically parallelized, but it's not obvious how to control this.

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

Other questions about parallelization in gam don't seem to be relevant.


Solution

  • Setting export OPENBLAS_NUM_THREADS=<whatever> appears to work (this is bash syntax; it may need adjusting for other shells/OSs). However, it has to be done before the R session starts; calling Sys.setenv(OPENBLAS_NUM_THREADS=<whatever>) from within R seems to be ineffective, presumably because OpenBLAS reads the variable only once, when the library is loaded.

    @Henrik points out in the comments that RhpcBLASctl::blas_set_num_threads(<whatever>) also works, and it can be called from within an already running R session.
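
For concreteness, here is a minimal sketch of both approaches as a bash script. The thread counts (4 and 2) are arbitrary illustrative values and fit_gam.R is a hypothetical script name, not something from my actual setup. The last step only runs if Rscript is on the PATH and the RhpcBLASctl package is installed.

```shell
#!/usr/bin/env bash
# Cap OpenBLAS at 4 threads for every process started from this shell.
# (4 is an arbitrary value chosen for illustration.)
export OPENBLAS_NUM_THREADS=4

# Child processes inherit the exported variable; an R session started
# from this shell sees the same value at startup.
bash -c 'echo "OPENBLAS_NUM_THREADS=$OPENBLAS_NUM_THREADS"'

# To limit a single run without exporting, prefix the command instead.
# fit_gam.R is a hypothetical script name:
#   OPENBLAS_NUM_THREADS=4 Rscript fit_gam.R

# In-session alternative: change the BLAS thread count after R has started.
# Skipped silently if Rscript or RhpcBLASctl is not available.
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    RhpcBLASctl::blas_set_num_threads(2)
    cat("BLAS threads now:", RhpcBLASctl::blas_get_num_procs(), "\n")
  }'
fi
```

In an interactive session, calling RhpcBLASctl::blas_set_num_threads() before fitting the gam model should have the same effect as the last step; watching top while the model fits is an easy way to confirm that the limit took hold.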