The loop runs over some number of models (n_mod
) and is distributed among the n_cpu
s. As you will note, running this code as mpirun -np 4 python test_mpi.py
produces 4 progress bars. This is understandable. But is there a way to use tdqm
to get one progress bar which tells me how many models have been completed?
from tqdm import tqdm
from mpi4py import MPI
import time
comm = MPI.COMM_WORLD
cpu_ind = comm.Get_rank()
n_cpu = comm.Get_size()
n_mod=100
for i in tqdm(range(n_mod)):
if (cpu_ind == int(i/int(n_mod/n_cpu))%n_cpu):
#some task here which is a function of i
time.sleep(0.02)
This can be achieved with a main node and worker node(s) setup.
Essentially only the rank == 0 node will be updating a progress bar whilst the worker nodes will simply be informing the main node that they have completed the task.
Worker:
(Defined similarly to your above code)
def worker(n_mod, size, rank):
comm = MPI.COMM_WORLD
step = size - 1 # Number of worker processes
start = rank - 1
for i in range(start, n_mod, step):
time.sleep(0.02) # Work
# Notify the master that a task is done
comm.send('done', dest=0, tag=1)
Main Node:
def pbar_node(n_mod, size):
comm = MPI.COMM_WORLD
pbar = tqdm(total=n_mod, desc="Processing Models")
completed = 0
while completed < n_mod:
# Receive a message from any worker
status = MPI.Status()
comm.recv(source=MPI.ANY_SOURCE, tag=1, status=status)
completed += 1
pbar.update(1)
pbar.close()
I have tested this with mpirun -np 4 python test.py and get a single progress bar for 4 models sharing 100 tasks as shown in the screenshot.
Ensure that you take into account the fact there will be the number of workers plus a main node when dealing with distributing tasks so you do not get off-by-one errors.
You can use these like so:
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
n_mod = N_TASKS
if rank == 0:
# pbar process
pbar_node(n_mod)else:
# Worker processes
worker(n_mod, size, rank)