Tags: bash, mpi, slurm, sbatch

Request maximum number of threads & cores on node via Slurm job scheduler


I have a heterogeneous cluster whose nodes contain either 14-core or 16-core CPUs (28 or 32 hardware threads, respectively). I manage job submissions using Slurm. The key requirement is that each job should fully use whichever node it lands on.

To illustrate the peculiarities of the problem, I show a job script that works on the 16-core CPUs:

#!/bin/bash

#SBATCH -J test
#SBATCH -o job.%j.out
#SBATCH -N 1
#SBATCH -n 32

mpirun -np 16 vasp

An example job script that works on the 14-core CPUs is:

#!/bin/bash

#SBATCH -J test
#SBATCH -o job.%j.out
#SBATCH -N 1
#SBATCH -n 28

mpirun -np 14 vasp

The second job script does run on the 16-core CPUs, but unfortunately it is about 35% slower there than when I request 32 threads as in the first script. That is an unacceptable performance loss for my application.

I haven't figured out a good way around this challenge. Ideally, I would request a variable number of resources, such as

#SBATCH -n [28-32]

and tailor the mpirun -np x vasp line accordingly. I haven't found a way to do this, however. Are there any suggestions on how to achieve this directly in Slurm, or is there a good workaround?
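To be clear about which half is hard: tailoring the mpirun line at run time seems doable, because the script body only executes after the allocation exists. A minimal sketch of that half, assuming the job was submitted with -n (so Slurm exports SLURM_NTASKS) and assuming two hardware threads per core with one MPI rank per physical core, as in the scripts above:

# Runs after the allocation, so Slurm's runtime variables exist.
# SLURM_NTASKS holds the granted task count (28 or 32 here);
# halving it places one MPI rank per physical core.
mpirun -np $(( SLURM_NTASKS / 2 )) vasp

The part I cannot solve is the resource request itself, i.e. making the #SBATCH lines adapt to whichever node type is free.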

I tried to use the environment variable $SLURM_CPUS_ON_NODE, but it is only set after a node has been selected, so it cannot be used in a #SBATCH line.
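To make the failure concrete: sbatch reads #SBATCH lines literally at submission time and never shell-expands them, so a directive like the following is rejected:

#SBATCH -n $SLURM_CPUS_ON_NODE    # rejected: read literally before
                                  # any node is selected, so the
                                  # variable is never expanded

Inside the script body, by contrast, the variable is set and usable, as in the sketch above.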

I also looked at the --constraint flag, but it does not seem to give sufficiently granular control over the number of tasks requested.
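For reference, --constraint can pin a job to one node type, but only if the administrators have tagged the nodes with features, and the task count still has to be a fixed number. A sketch with a hypothetical feature name:

#SBATCH --constraint=cpu16    # hypothetical feature; must be defined
                              # by the admins in slurm.conf
#SBATCH -n 32                 # still a fixed count, which is the problem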


Solution

  • Actually, it should work as you want if you simply specify that you want a full node:

    #!/bin/bash
    
    #SBATCH -J test
    #SBATCH -o job.%j.out
    #SBATCH -N 1
    #SBATCH --exclusive
    
    mpirun vasp
    

    mpirun will start as many processes as given by SLURM_TASKS_PER_NODE, which Slurm sets to the number of tasks that can be created on the node; if you do not request more than one CPU per task, that is simply the number of CPUs on the node.
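
    A quick way to verify this from inside the job is to print the variable. A minimal sketch, assuming an MPI library built with Slurm support (e.g. Open MPI) so that mpirun picks the rank count up from the allocation:

    #!/bin/bash
    
    #SBATCH -J test
    #SBATCH -o job.%j.out
    #SBATCH -N 1
    #SBATCH --exclusive
    
    # On this cluster Slurm counts hardware threads as CPUs, so
    # this prints 32 on a 16-core node and 28 on a 14-core node.
    echo "Tasks on this node: ${SLURM_TASKS_PER_NODE}"
    
    # No -np needed: mpirun takes the rank count from the
    # Slurm allocation.
    mpirun vasp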