
What are the basic rules for propagating lists and dictionaries across processes using the Manager of the multiprocessing module?


I am trying to use the multiprocessing module to parallelize a CPU-intensive piece of code over multiple cores. This module looks terrific in several respects, but when I try to pass lists and dictionaries between processes, changes are not always propagated across the processes as I would expect. What are the rules for this? For example, how do I propagate deep changes in nested lists and dictionaries between processes?

Below is an MRE showing a simple instance of an apparent failure of propagation. If I change shared_list in parent_func with .append or .extend, the data is propagated to child_func, but if I assign a new list to shared_list outright, propagation does not occur.

from time import sleep
from multiprocessing import Manager, Process

def parent_func(shared_list):
    sleep(1)
    shared_list.extend(['a','b','c'])  # propagated
    # shared_list = ['a','b','c']  # not propagated

def child_func(shared_list):
    k = 0
    while k < 8:
        k += 1
        print(f'{k}: {shared_list}')
        sleep(0.2)

def main_func():
    with Manager() as manager:
        shared_list = manager.list()

        process = Process(target=parent_func, args=(shared_list,))
        processes = [process]
        process.start()
        process = Process(target=child_func, args=(shared_list,))
        processes.append(process)
        process.start()

        for process in processes:
            process.join()

        print('---')
        print(list(shared_list))

if __name__ == '__main__':
    main_func()

For dictionaries, an example somewhat similar to the above is shown here.


What I have tried:

I have checked the multiprocessing documentation, but could not find much on this question there. As a separate issue, Google AI is currently displaying inline code phrases as empty gray boxes, so I am unable to obtain a Google AI summary on the topic.


Solution

  • When you write:

    shared_list = ['a','b','c']
    

    ...all you're doing is rebinding the local name shared_list to a new, ordinary list. The managed list that the name used to refer to is not touched, so no other process sees a change.

    However, you can replace the contents of the managed list in place by assigning to a slice of it:

    shared_list[:] = ["x","y","z"]
    

    So, here's a complete runnable example:

    import multiprocessing as mp
    
    
    def p1(list_):
        """ subprocess assigns new values to managed list """
        list_[:] = ["x", "y", "z"]
    
    
    if __name__ == "__main__":
        with mp.Manager() as manager:
            args = [manager.list("abc")]
            print(*args)
            (p := mp.Process(target=p1, args=args)).start()
            p.join()
            print(*args)
    

    Output:

    ['a', 'b', 'c']
    ['x', 'y', 'z']
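On the question of deep changes in nested lists and dictionaries: as far as the multiprocessing documentation describes it, a proxy only forwards method calls made on the proxy itself. Reading an ordinary list out of a managed dict hands you a copy, so mutating that copy is lost unless you store it back (which triggers `__setitem__` on the proxy) or the nested container is itself a managed object. The following is a minimal sketch of those cases, assuming Python 3.6 or later, where managed containers can hold other proxies. The names `worker` and `run_demo` and the dictionary keys are made up for illustration. It also shows `clear()` followed by `update()` as a dict counterpart to the slice assignment above; note that these are two separate calls, not one atomic operation.

```python
import multiprocessing as mp


def worker(shared, settings):
    """Child process: try several kinds of update on managed containers."""
    # The proxy returns a *copy* of the ordinary nested list; appending to
    # that copy never reaches the manager, so this change is lost.
    shared["plain"].append(2)

    # Fetch, modify, then store back: the assignment calls __setitem__ on
    # the proxy, so the new value is propagated.
    tmp = shared["reassigned"]
    tmp.append(2)
    shared["reassigned"] = tmp

    # The nested value is itself a managed list, so we get a proxy back
    # and the append is forwarded to the manager.
    shared["managed"].append(2)

    # Replace the whole contents of a managed dict in place
    # (rebinding `settings = {...}` would only change the local name).
    settings.clear()
    settings.update({"mode": "new"})


def run_demo():
    """Run the worker once and return plain-Python snapshots of the results."""
    with mp.Manager() as manager:
        shared = manager.dict()
        shared["plain"] = [1]                  # ordinary list stored by value
        shared["reassigned"] = [1]             # ordinary list, written back by the child
        shared["managed"] = manager.list([1])  # nested proxy
        settings = manager.dict({"mode": "old"})

        p = mp.Process(target=worker, args=(shared, settings))
        p.start()
        p.join()

        # Take snapshots while the manager is still running.
        return {
            "plain": shared["plain"],
            "reassigned": shared["reassigned"],
            "managed": list(shared["managed"]),
            "settings": settings.copy(),
        }


if __name__ == "__main__":
    for key, value in run_demo().items():
        print(f"{key}: {value}")
```

If the nesting is deep and updated often, making each level a managed object keeps the code simple at the cost of one round trip to the manager per access; the fetch, modify, store-back pattern makes fewer calls but overwrites whatever another process stored in between.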