Let's consider the following example:
    from pathos.pools import ProcessPool

    class A:
        def run(self, arg: int):
            my_list = list(...)
            my_dict = dict(...)

            def __run_parallel(arg: int):
                local_variable = 42
                # some code and read access...
                read_only1 = my_list[...]
                read_only2 = my_dict[...]
                # some code and write access...
                my_list.append(arg)
                my_dict[arg] = local_variable

            ProcessPool(4).map(__run_parallel, range(1000))
Since it seems that neither `list` nor `dict` is safe to share between processes, I'm searching for a way to efficiently share access to these variables from all processes in the pool.
So far, I've tried passing `my_list` and `my_dict` as additional arguments to `__run_parallel` using `pathos.helpers.mp.Manager`. However, even though it works, it's horrendously slow, since a manager routes every access through a separate server process (it's built with distributed use in mind).
Since I've been working on this in trial-and-error sessions for multiple evenings now, I'd like to ask whether somebody knows how to efficiently use a shared `dict` and `list` within `__run_parallel` using `pathos`.
Converting both the `list` and `dict` variables to `pathos.helpers.mp.Array`, without an intermediate `pathos.helpers.mp.Manager`, as suggested by @Mike McKerns, brought the desired performance boost.