Tags: r, conda, blas, rparallel, betareg

betareg not using multithreading on CentOS


Model Fitting Runs Single-Threaded on CentOS

I am fitting a mixture of Beta regressions with the betamix function from the betareg package. I originally developed the code on Mac OS X, but am now running it at scale on an HPC cluster that uses LSF for job management and CentOS on the nodes. In both settings I use a Conda environment defined by the following YAML:

betareg.yaml

name: betareg
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - r-base=4.0.3
  - r-tidyverse
  - r-magrittr
  - r-cowplot
  - r-knitr  
  - r-flexmix
  - r-betareg

On my local machine, the betamix step automatically scales to all available cores. On the cluster, however, where I deploy jobs via Snakemake with threads: 16 specified, monitoring shows every job running single-threaded, even though the Snakemake logs show 16 cores correctly allocated per job.

Parallel Package?

Comparing sessionInfo() output between the two environments showed that the parallel package was not loaded in the HPC session. However, explicitly adding library(parallel) made no difference.

Identical BLAS Library Versions

Another thought was that the BLAS libraries might differ. However, these also appear to match (though they are obviously platform-specific builds).

osx-64 BLAS

## Matrix products: default
## BLAS/LAPACK: /Users/user/miniconda3/envs/betareg/lib/libopenblasp-r0.3.12.dylib

linux-64 BLAS

## Matrix products: default
## BLAS/LAPACK: /home/user/mm-stem-cluster/.snakemake/conda/80842b70/lib/libopenblasp-r0.3.12.so
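
A matching library version does not by itself show how many threads the library will use at run time. The sketch below is one hedged way to check this from inside R on each platform. It prints the BLAS path that sessionInfo() reports, the core count the parallel package sees, and the environment variables that OpenBLAS and OpenMP runtimes commonly consult for their thread count. Whether the cluster sets any of these variables is an assumption to verify, not something established above.

```r
# Path of the BLAS library R is linked against; sessionInfo() prints this
# as the "BLAS/LAPACK:" line (the BLAS component exists in R >= 3.4.0).
si <- sessionInfo()
print(si$BLAS)

# Cores visible to R on this machine. This is the hardware count, which
# may be larger than what the scheduler actually allocated to the job.
print(parallel::detectCores())

# Environment variables commonly read by OpenBLAS / OpenMP / MKL to pick a
# thread count. An empty string means the variable is unset.
thread_vars <- c("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")
thread_env <- Sys.getenv(thread_vars, unset = "")
print(thread_env)
```

If one of these variables is set to 1 in the job environment but unset locally, that would be one candidate explanation for the difference in behaviour.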

How can I get the CentOS execution to use all allocated threads?


Solution

  • Specify Threads with RhpcBLASctl

    The RhpcBLASctl package provides a function, blas_set_num_threads(), that appears sufficient to make the BLAS library use the specified number of threads. For this specific application, I added r-rhpcblasctl to the YAML:

    betareg.yaml

    name: betareg
    channels:
      - conda-forge
      - bioconda
      - defaults
    dependencies:
      - r-base=4.0.3
      - r-tidyverse
      - r-magrittr
      - r-cowplot
      - r-knitr  
      - r-flexmix
      - r-betareg
      - r-rhpcblasctl
    

    and added the following line to the script to set the correct number of threads:

    RhpcBLASctl::blas_set_num_threads(snakemake@threads)
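
For a script that should also run outside Snakemake (for example, interactively while debugging), a slightly more defensive variant of the line above might look like the following. This is a minimal sketch, not part of the original fix: the helper name resolve_threads and the single-thread fallback are illustrative assumptions. It relies only on the snakemake object that Snakemake injects into R scripts and on the RhpcBLASctl functions blas_set_num_threads() and blas_get_num_procs().

```r
# Hypothetical helper: use Snakemake's thread count when the script runs
# under a Snakemake `script:` directive, otherwise fall back to a default.
resolve_threads <- function(default = 1L) {
  if (exists("snakemake")) {
    as.integer(snakemake@threads)
  } else {
    as.integer(default)
  }
}

n_threads <- resolve_threads()

if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  # Ask the BLAS library to use the resolved number of threads.
  RhpcBLASctl::blas_set_num_threads(n_threads)
  # Report what the BLAS library now says it will use.
  message("BLAS threads: ", RhpcBLASctl::blas_get_num_procs())
} else {
  message("RhpcBLASctl is not installed; BLAS thread count left unchanged")
}
```

Calling this before the betamix step, near the top of the script, keeps the thread setting in one place and logs the value so it can be compared against the allocation shown in the Snakemake logs.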