I am fitting a mixture of Beta regressions with the betamix function from the betareg package. I originally developed the code on Mac OS X, but am now scaling it up on an HPC cluster that uses LSF for job management and runs CentOS on the nodes. In both cases I use a Conda environment defined by the following YAML:
betareg.yaml
name: betareg
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - r-base=4.0.3
  - r-tidyverse
  - r-magrittr
  - r-cowplot
  - r-knitr
  - r-flexmix
  - r-betareg
On my local machine, the betamix step automatically scales to all available cores. On the cluster, however, where I deploy jobs via Snakemake with threads: 16, monitoring shows that all jobs run single-threaded, even though the Snakemake logs clearly show 16 cores allocated per job.
Comparing the sessionInfo() output from both environments showed that parallel was not loaded on the HPC cluster. However, explicitly adding library(parallel) made no difference.
Another thought was that the BLAS libraries might differ, but these also appear to match (apart from being platform-specific builds).
osx-64 BLAS
## Matrix products: default
## BLAS/LAPACK: /Users/user/miniconda3/envs/betareg/lib/libopenblasp-r0.3.12.dylib
linux-64 BLAS
## Matrix products: default
## BLAS/LAPACK: /home/user/mm-stem-cluster/.snakemake/conda/80842b70/lib/libopenblasp-r0.3.12.so
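Since the BLAS builds match, one further thing that could be compared between the two machines is the thread-related context each R session sees. The following is only a rough diagnostic sketch: the helper name report_thread_context and the choice of environment variables are illustrative assumptions, not something taken from betareg or Snakemake.

```r
# Environment variables commonly honored by threaded BLAS/OpenMP builds.
# (Which ones matter for a given build is an assumption here.)
thread_env_vars <- c("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")

# Hypothetical helper: gather thread-related settings visible to this session.
report_thread_context <- function(vars = thread_env_vars) {
  list(
    # Cores the OS reports; on a shared node this may exceed the scheduler allocation.
    detected_cores = parallel::detectCores(),
    # Named character vector; NA for variables that are not set.
    env = Sys.getenv(vars, unset = NA)
  )
}

ctx <- report_thread_context()
str(ctx)
```

Running this both locally and inside a cluster job would show whether any of these variables differ between the two environments.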
How can I get the CentOS execution to use all allocated threads?
The RhpcBLASctl package provides a function, blas_set_num_threads(), that appears sufficient to enable use of the specified number of threads. For this specific application, I updated the YAML to
betareg.yaml
name: betareg
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - r-base=4.0.3
  - r-tidyverse
  - r-magrittr
  - r-cowplot
  - r-knitr
  - r-flexmix
  - r-betareg
  - r-rhpcblasctl
and added the following to the script to set the correct number of threads:
RhpcBLASctl::blas_set_num_threads(snakemake@threads)
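A slightly more defensive variant of that line, as a sketch only: it assumes the script may also be run interactively, where the snakemake object does not exist, and falls back to a single thread in that case. The helper name resolve_threads and the fallback value are hypothetical.

```r
# Hypothetical helper: use snakemake@threads when the script is run by
# Snakemake, otherwise fall back to a caller-supplied default.
resolve_threads <- function(fallback = 1L) {
  if (exists("snakemake", envir = globalenv(), inherits = FALSE)) {
    as.integer(get("snakemake", envir = globalenv())@threads)
  } else {
    as.integer(fallback)
  }
}

n_threads <- resolve_threads()

# Only touch BLAS settings if RhpcBLASctl is installed in this environment.
if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  RhpcBLASctl::blas_set_num_threads(n_threads)
  message("BLAS threads requested: ", n_threads,
          "; processors reported by BLAS: ", RhpcBLASctl::blas_get_num_procs())
}
```

The message is just a quick sanity check in the job log; confirming actual multi-threaded use still requires monitoring the running job.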