gpu, cpu, slurm

Running a job on CPU by default, but on GPU when available in Slurm


Is there a way to submit a job to Slurm with sbatch so that it uses a GPU if one is available, but runs on CPU if no GPU is available?

Setting #SBATCH --gres=gpu:1 makes the job run only on nodes where a GPU is available. Omitting it, or setting it to 0, never makes a GPU available to the job.


Solution

  • There is unfortunately no direct solution in Slurm for this use case. A workaround can be to submit two jobs, one with --gres and the other without, and

    The above configuration will make sure only one job can be started by Slurm, and as soon as one starts, it cancels the other.
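
As a rough illustration of the "as soon as one starts, it cancels the other" part, here is a minimal sketch of a job script that could be submitted twice, once with a GPU request and once without. The job name gpu-or-cpu, the file name job.sh, and the helper pick_device are hypothetical. The sketch also assumes that the cluster exposes allocated GPUs through CUDA_VISIBLE_DEVICES, which may not hold on every installation.

```shell
#!/bin/bash
#SBATCH --job-name=gpu-or-cpu   # hypothetical name, shared by both submissions

# Hypothetical job name; must match the --job-name directive above.
JOB_NAME="gpu-or-cpu"

# Cancel this user's other *pending* jobs with the same name, so that the
# twin submission does not start later. The guard lets the script run
# outside Slurm as well (for example, for a local dry run).
if command -v scancel >/dev/null 2>&1; then
    scancel --user="${USER:-$(id -un)}" --name="$JOB_NAME" --state=PENDING
fi

# Print "gpu" if a GPU appears to be allocated to this job, else "cpu".
# Assumption: Slurm's GPU plugin sets CUDA_VISIBLE_DEVICES for --gres=gpu
# jobs; an empty value or "NoDevFiles" (seen on some older Slurm versions)
# is treated as "no GPU".
pick_device() {
    case "${CUDA_VISIBLE_DEVICES:-}" in
        ""|NoDevFiles) echo cpu ;;
        *)             echo gpu ;;
    esac
}

DEVICE="$(pick_device)"
echo "Running on: $DEVICE"
# Replace the echo above with the real workload, passing "$DEVICE" to it.
```

Under these assumptions the script would be submitted as `sbatch --gres=gpu:1 job.sh` and `sbatch job.sh`; options given on the sbatch command line take precedence over #SBATCH directives in the script. Note that this sketch only cancels pending jobs, so on its own it does not prevent both submissions from starting at the same moment.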