pythonparallel-processingjoblibtqdm

How can we use tqdm in a parallel execution with joblib?


I want to run a function in parallel, and wait until all parallel nodes are done, using joblib. Like in the example:

from math import sqrt
from joblib import Parallel, delayed
Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))

But, I want that the execution will be seen in a single progressbar like with tqdm, showing how many jobs has been completed.

How would you do that?


Solution

  • If your problem consists of many parts, you could split the parts into k subgroups, run each subgroup in parallel and update the progressbar in between, resulting in k updates of the progress.

    This is demonstrated in the following example from the documentation.

    >>> with Parallel(n_jobs=2) as parallel:
    ...    accumulator = 0.
    ...    n_iter = 0
    ...    while accumulator < 1000:
    ...        results = parallel(delayed(sqrt)(accumulator + i ** 2)
    ...                           for i in range(5))
    ...        accumulator += sum(results)  # synchronization barrier
    ...        n_iter += 1
    

    https://pythonhosted.org/joblib/parallel.html#reusing-a-pool-of-workers