python, memory-leaks, multiprocessing, joblib

Joblib Parallel doesn't terminate processes


I run the code in parallel in the following fashion:

from joblib import Parallel, delayed
grouped_data = Parallel(n_jobs=14)(delayed(function)(group) for group in grouped_data)

After the computation is done, I can see that all the spawned processes are still alive and consuming memory in a system monitor:

[screenshot: system monitor showing the spawned worker processes still alive and holding memory]

None of these processes are killed until the main process terminates, which leads to a memory leak. If I do the same with multiprocessing.Pool in the following way:

import numpy as np
from multiprocessing import Pool

pool = Pool(14)
pool.map(apply_wrapper, np.array_split(groups, 14))
pool.close()   # stop accepting new tasks
pool.join()    # wait for the workers to exit

then I see that all the spawned processes are terminated at the end and no memory is leaked. However, I need joblib and its loky backend, since it can serialize some local functions.
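
To illustrate what I mean by local functions, here is a minimal sketch (the names run, local_function, and offset are made up for the example): the loky backend serializes tasks with cloudpickle, so a function defined inside another function works, while multiprocessing's plain pickle rejects it:

from joblib import Parallel, delayed

def run():
    offset = 10

    def local_function(x):  # local closure: plain pickle cannot serialize this
        return x + offset

    # Works with the default loky backend, which pickles via cloudpickle;
    # multiprocessing.Pool would fail with "Can't pickle local object" here.
    return Parallel(n_jobs=2)(delayed(local_function)(i) for i in range(4))

if __name__ == '__main__':
    print(run())  # [10, 11, 12, 13]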

How can I forcefully kill processes spawned by joblib.Parallel and release memory? My environment is the following: Python 3.8, Ubuntu Linux.


Solution

  • What I can wrap up after investigating this myself:

    1. joblib.Parallel is not obliged to terminate its worker processes after a single successful invocation
    2. The loky backend does not physically terminate its workers; this is an intentional design choice explained by the authors: Loky Code Line (a sketch of shutting that shared pool down directly follows these notes)
    3. If you want to explicitly release the workers, you can use my snippet:
        import psutil
        from joblib import Parallel, delayed

        current_process = psutil.Process()

        # Record the PIDs of the subprocesses that exist before the parallel call
        subproc_before = {p.pid for p in current_process.children(recursive=True)}

        grouped_data = Parallel(n_jobs=14)(delayed(function)(group) for group in grouped_data)

        # Any new child PID must belong to a worker spawned by joblib
        subproc_after = {p.pid for p in current_process.children(recursive=True)}
        for subproc in subproc_after - subproc_before:
            print('Killing process with pid {}'.format(subproc))
            psutil.Process(subproc).terminate()
    
    4. The code above is not thread/process safe. If you have another source of spawning subprocesses, you should block its execution while this snippet runs.
    5. Everything above is valid for joblib version 1.0.1.
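
    As a side note, here is an alternative sketch (my own assumption about the vendored loky API, so verify it against your joblib version): joblib ships loky under joblib.externals.loky, and shutting down the shared reusable executor is another way to release the workers without tracking PIDs by hand. The function square is a stand-in for your own workload:

        from joblib import Parallel, delayed
        from joblib.externals.loky import get_reusable_executor

        def square(x):
            return x * x

        results = Parallel(n_jobs=14)(delayed(square)(i) for i in range(100))

        # Shut down the shared loky executor; its worker processes exit,
        # and the next Parallel call will simply spawn a fresh pool.
        get_reusable_executor().shutdown(wait=True)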