python, subprocess, hpc, singularity-container

How do I run many Singularity/Apptainer containers from one Python script, using multiple CPUs and nodes?


Problem statement

I have a Python program that needs to launch a number of Singularity containers in parallel.

Is it possible to do this, exploiting all of the available hardware, using only built-in libraries (subprocess, concurrent.futures, etc.)?

The 'host' script runs on 1 CPU and is launched by SLURM. It needs to launch the containers, wait for them to complete, do some analysis, and repeat.

For example, if I have 40 containers each needing 2 CPUs, and two nodes each with 76 CPUs, then there should be something like:

Node 1 (76 CPUs)          Node 2 (76 CPUs)
-----------------------   -----------------------
Host script (1 CPU)       3 containers (6 CPUs)
37 containers (74 CPUs)   70 spare CPUs
1 spare CPU

MWE

Singularity recipe (stress.def)

We use stress to fully utilise a given number of CPUs:

Bootstrap: docker
From: ubuntu:16.04

%post
    apt update -y
    apt install -y stress

%runscript
    echo $(uname -n)
    stress "$@"

Build with singularity build stress.simg stress.def.

Python host script (main.py)

Spin up 40 containers, each running the stress image with 2 CPUs for 10s:

from subprocess import Popen

n_processes = 40
cpus_per_process = 2
stress_time = 10

command = [
    "singularity",
    "run",
    "stress.simg",
    "-c",
    str(cpus_per_process),
    "-t",
    f"{stress_time}s",
]
# Launch all containers concurrently.
processes = [Popen(command) for _ in range(n_processes)]

# Wait for every container to finish before moving on.
for p in processes:
    p.wait()
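
For reference, the same launch can also be written with concurrent.futures, one of the built-in options mentioned in the problem statement. A minimal sketch, equivalent in behaviour to the Popen version above (it still just starts 40 local processes):

from concurrent.futures import ThreadPoolExecutor
from subprocess import run

n_processes = 40
cpus_per_process = 2
stress_time = 10

command = [
    "singularity",
    "run",
    "stress.simg",
    "-c",
    str(cpus_per_process),
    "-t",
    f"{stress_time}s",
]

# Threads are enough here: each worker only blocks on an external process.
with ThreadPoolExecutor(max_workers=n_processes) as pool:
    futures = [pool.submit(run, command) for _ in range(n_processes)]
    results = [f.result() for f in futures]  # wait for all containers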

SLURM script

#!/bin/bash
#SBATCH -J stress
#SBATCH -A myacc
#SBATCH -p mypart
#SBATCH --output=%x_%j.out
#SBATCH --nodes=2
#SBATCH --ntasks=40
#SBATCH --cpus-per-task=2
#SBATCH --time=24:00:00

python main.py

Results

The above only runs on one of the two nodes. Total execution time is around 20s, and the containers do not all run concurrently: the first 38 run first, and then the last two.

As such, it does not have the desired effect.
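
One way to confirm the placement (a diagnostic sketch, not part of the run above): since the runscript echoes uname -n, capturing each container's stdout shows which node it actually landed on.

from subprocess import Popen, PIPE

n_processes = 40
# Same invocation as in main.py, with the values written out inline.
command = ["singularity", "run", "stress.simg", "-c", "2", "-t", "10s"]

# Capture stdout so the hostname echoed by the runscript is visible.
processes = [Popen(command, stdout=PIPE, text=True) for _ in range(n_processes)]

for i, p in enumerate(processes):
    out, _ = p.communicate()
    # The first line of output is the node name from `echo $(uname -n)`.
    print(f"container {i} ran on {out.splitlines()[0]}")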


Solution

  • It turns out my question stemmed from a misunderstanding of what should be handled by Singularity and what should be handled by SLURM.

    My mistake was thinking that Singularity could see and utilise other nodes; in reality, it can only see resources available on the current node.

    Solution:

    1. Allocate all the resources that the overall job will need in the SLURM script, treating each container as a separate task. In the above example, that means setting --ntasks=40 and --cpus-per-task=2.
    2. Launch each subprocess with srun. This lets SLURM assign the containers resources from the pool that has already been allotted to this particular job.

    Modified main.py:

    from subprocess import Popen
    
    n_processes = 40
    cpus_per_process = 2
    stress_time = 10
    
    command = [
        "srun", # <--------- MODIFICATION
        "singularity",
        "run",
        "stress.simg",
        "-c",
        str(cpus_per_process),
        "-t",
        f"{stress_time}s",
    ]
    processes = [Popen(command) for _ in range(n_processes)]
    
    for p in processes:
        p.wait()
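
    The launch/wait/analyse/repeat loop described in the problem statement then simply wraps these srun launches. A sketch, where analyse() and n_rounds are hypothetical placeholders for the real per-round analysis:

    from subprocess import Popen

    n_processes = 40
    cpus_per_process = 2
    stress_time = 10
    n_rounds = 3  # hypothetical: number of launch/analyse cycles

    command = [
        "srun",
        "singularity",
        "run",
        "stress.simg",
        "-c",
        str(cpus_per_process),
        "-t",
        f"{stress_time}s",
    ]

    def analyse():
        # Hypothetical placeholder for the analysis the host does on its 1 CPU.
        pass

    for _ in range(n_rounds):
        # Launch all containers for this round via srun and wait for them all.
        processes = [Popen(command) for _ in range(n_processes)]
        for p in processes:
            p.wait()
        analyse()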