I want to run code using multiprocessing on a server with a Slurm architecture. I want to limit the number of CPUs available and have the code create one child process for each of them.
My code can be simplified this way:
def Func(ins):
    ###
    # things...
    ###
    return var

if __name__ == '__main__':
    from multiprocessing import Pool
    from multiprocessing import active_children
    from multiprocessing import cpu_count

    p = Pool()
    print("active cpus = ", cpu_count())
    print("open process = ", p._processes)
    print("active_children = ", len(active_children()))
    results = p.map(Func, range(2000))
    p.close()
    exit()
driven by this batch script:
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=48
#SBATCH --mem=40000 # Memory per node (in MB).
module load python
conda activate myenv
python3 test.py
echo 'done!'
What I get is that the code always runs on the maximum number of CPUs (272), whatever combination of parameters I try:
active cpus = 272
open process = 272
active_children = 272
done!
I launch the job with the command
sbatch job.sh
What am I doing wrong?
Your Python code is responsible for creating the wanted number of processes based on the Slurm allocation: `Pool()` with no argument spawns one worker per CPU the operating system reports, ignoring the allocation.
If you want, as is often the case, one process per allocated CPU, your code should look like this:
if __name__ == '__main__':
    import os
    from multiprocessing import Pool
    from multiprocessing import active_children
    from multiprocessing import cpu_count

    ncpus = int(os.environ['SLURM_CPUS_PER_TASK'])
    p = Pool(ncpus)
    print("active cpus = ", cpu_count())
    print("open process = ", p._processes)
    print("active_children = ", len(active_children()))
    results = p.map(Func, range(2000))
    p.close()
    exit()
The SLURM_CPUS_PER_TASK environment variable will hold the value you specify in the #SBATCH --cpus-per-task=48 line of the submission script.