I am trying to use the multiprocessing module to parallelize a CPU-intensive piece of code over multiple cores. This module looks terrific in several respects, but when I try to pass lists and dictionaries between processes, changes are not always propagated across the processes as I would expect. What are the rules for this? For example, how do I propagate deep changes in nested lists and dictionaries between processes?
Below is an MRE showing a simple instance of an apparent failure of propagation. If I change shared_list in parent_func with .append or .extend, then the data is propagated to child_func, but if I try to change the list by assigning a new list to it outright, then propagation does not occur.
from time import sleep
from multiprocessing import Manager, Process


def parent_func(shared_list):
    sleep(1)
    shared_list.extend(['a','b','c']) # propagated
    # shared_list = ['a','b','c'] # not propagated


def child_func(shared_list):
    k = 0
    while k < 8:
        k += 1
        print(f'{k}: {shared_list}')
        sleep(0.2)


def main_func():
    with Manager() as manager:
        shared_list = manager.list()
        process = Process(target=parent_func, args=(shared_list,))
        processes = [process]
        process.start()
        process = Process(target=child_func, args=(shared_list,))
        processes.append(process)
        process.start()
        for process in processes:
            process.join()
        print('---')
        print(list(shared_list))


if __name__ == '__main__':
    main_func()
For dictionaries, an example somewhat similar to the above is shown here.
What I have tried:
I have checked the multiprocessing documentation, but could not find much on this question there. As a separate issue, Google AI is currently displaying inline code phrases as empty gray boxes, so I am unable to obtain a Google AI summary on the topic.
When you write:
shared_list = ['a','b','c']
...all you're doing is assigning a reference to a new list to a local variable called shared_list. The managed list that was passed in is left untouched, so no other process sees the change.
However, you can replace the contents of the existing managed list in place with slice assignment:
shared_list[:] = ["x","y","z"]
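The difference between rebinding a name and changing an object in place can be seen without any processes at all. Here is a minimal sketch using plain lists; the function names rebind and mutate_in_place are made up for illustration and do not appear in the question:

```python
def rebind(items):
    # Assignment only points the local name at a new list;
    # the list object the caller passed in is not touched.
    items = ["a", "b", "c"]


def mutate_in_place(items):
    # Slice assignment replaces the contents of the existing object,
    # so every holder of a reference to it sees the change.
    items[:] = ["a", "b", "c"]


first = []
rebind(first)

second = []
mutate_in_place(second)

print(first)   # the caller's list is still empty
print(second)  # the caller's list now holds the new items
```

A managed list behaves the same way at this level: only operations performed on the proxy object itself (methods, item or slice assignment) are forwarded to the manager.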
So, here's a complete runnable example:
import multiprocessing as mp


def p1(list_):
    """ subprocess assigns new values to managed list """
    list_[:] = ["x", "y", "z"]


if __name__ == "__main__":
    with mp.Manager() as manager:
        args = [manager.list("abc")]
        print(*args)
        (p := mp.Process(target=p1, args=args)).start()
        p.join()
        print(*args)
Output:
['a', 'b', 'c']
['x', 'y', 'z']
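As for nested lists and dictionaries: as far as I understand the multiprocessing documentation, an ordinary list or dict stored inside a managed container is not tracked, so in-place changes to it are not propagated. Two approaches that should work are reassigning the modified value to the container (which goes through the proxy's __setitem__), or nesting a managed object inside the managed container (supported since Python 3.6). The following is only a sketch of that behaviour; nested_update_demo and the keys "plain" and "proxy" are names invented for this example:

```python
import multiprocessing as mp


def nested_update_demo():
    """Return three snapshots showing which nested updates reach the manager."""
    with mp.Manager() as manager:
        shared = manager.dict()

        # A plain list stored in a managed dict is held by value.
        shared["plain"] = [1, 2]
        shared["plain"].append(3)  # mutates a temporary local copy only
        lost = list(shared["plain"])

        # Reassigning the modified value goes through the proxy's __setitem__.
        items = shared["plain"]
        items.append(3)
        shared["plain"] = items
        reassigned = list(shared["plain"])

        # A nested managed list is itself a proxy, so in-place changes propagate.
        shared["proxy"] = manager.list([1, 2])
        shared["proxy"].append(3)
        nested = list(shared["proxy"])

        return lost, reassigned, nested


if __name__ == "__main__":
    print(nested_update_demo())
```

The same rules should apply when the managed dict is handed to a child process: only changes made through a proxy are visible elsewhere.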