python-3.x pathos

How to efficiently share dicts and lists between processes using ProcessPool


Let's consider the following example:

from pathos.pools import ProcessPool

class A:
    def run(self, arg: int):

        my_list = list(...)
        my_dict = dict(...)

        def __run_parallel(arg: int):
            local_variable = 42

            # some code and read access...

            read_only1 = my_list[...]
            read_only2 = my_dict[...]


            # some code and write access...

            my_list.append(arg)
            my_dict[arg] = local_variable

        ProcessPool(4).map(__run_parallel, range(1000))

Since it seems that neither list nor dict is thread-safe, I'm looking for a way to efficiently share these variables among all processes in the pool.

So far, I've tried passing my_list and my_dict to __run_parallel as additional arguments, created via pa.helpers.mp.Manager. It works, but it's horrendously slow (presumably because it's built for distributed systems).
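
For reference, a minimal sketch of what the Manager-based variant might look like. It is written against the standard library's multiprocessing module, on the assumption that pathos.helpers.mp (the multiprocess fork) exposes the same API. The names _work and run_with_manager and the "fork" start method are illustrative choices, not part of the original code.

```python
import multiprocessing as mp  # assumption: pathos.helpers.mp mirrors this API
from functools import partial


def _work(shared_list, shared_dict, arg):
    local_variable = 42
    # Each of these calls is a round trip to the manager's server process.
    shared_list.append(arg)
    shared_dict[arg] = local_variable


def run_with_manager(n_items=1000, n_workers=4):
    # Assumption: POSIX "fork" start method (not available on Windows).
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        shared_list = manager.list()
        shared_dict = manager.dict()
        with ctx.Pool(n_workers) as pool:
            # Proxies are picklable, so they can be passed as arguments.
            pool.map(partial(_work, shared_list, shared_dict), range(n_items))
        # Copy the contents out before the manager shuts down.
        return sorted(shared_list[:]), shared_dict.copy()
```

Every append and item assignment on a proxy is forwarded to a separate server process, which is one likely reason this approach feels slow when the per-task work is small.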

Since I've been working on this by trial and error for several evenings now, I'd like to ask whether somebody knows how to efficiently use a shared dict and list within __run_parallel using pathos.


Solution

  • Converting both the list and the dict to pathos.helpers.mp.Array, without an intermediate pa.helpers.mp.Manager, as suggested by @Mike McKerns, brought the desired performance boost.
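
The post does not show the converted code, so the following is only a sketch of one way the Array-based approach could look. It assumes that the dict keys are the integers 0..n-1 (so the dict can become a fixed-size array indexed by key), that the list can be replaced by one preallocated slot per task, and that pathos.helpers.mp exposes the same API as the standard multiprocessing module used here. All function names are hypothetical.

```python
import multiprocessing as mp  # assumption: pathos.helpers.mp mirrors this API

_shared = {}  # per-worker globals, filled in by the pool initializer


def _init_worker(values, flags):
    # Runs once in each worker. Shared arrays are handed over at worker
    # start-up (inheritance) rather than as map() arguments.
    _shared["values"] = values
    _shared["flags"] = flags


def _work(arg):
    local_variable = 42
    # Stand-in for my_dict[arg] = local_variable (integer keys assumed).
    _shared["values"][arg] = local_variable
    # Stand-in for my_list.append(arg): mark slot `arg` as filled.
    _shared["flags"][arg] = 1


def run_shared(n_items=1000, n_workers=4):
    # Assumption: POSIX "fork" start method (not available on Windows).
    ctx = mp.get_context("fork")
    # lock=False is safe here only because each task writes its own slot.
    values = ctx.Array("i", n_items, lock=False)
    flags = ctx.Array("b", n_items, lock=False)
    with ctx.Pool(n_workers, initializer=_init_worker,
                  initargs=(values, flags)) as pool:
        pool.map(_work, range(n_items))
    # Read the results back in the parent process.
    return list(values), [i for i, f in enumerate(flags) if f]
```

Reads and writes go straight to shared memory instead of through a server process. If several tasks could write to the same slot, the default lock=True (or an explicit lock) would be the safer choice.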