Tags: python, multiprocessing, python-multiprocessing, nameerror

name 'list2_array' is not defined even when using multiprocessing.Array()


Problem

I was trying to use multiprocessing for a task that needs a shared array. Even though I create the array with multiprocessing.Array(), the worker function fails with: name 'list2_array' is not defined

Here is a simplified way to replicate it:

from multiprocessing import Pool, Array
from ctypes import c_char

def func(num):
    if num in list2_array:
        return num
    else:
        return num + b" Not Found"

if __name__ == "__main__":
    my_list = [b"1", b"2", b"4", b"8"]
    list2 = [b"1", b"4"]

    with Pool() as pool:
        list2_array = Array(c_char, list2)
        results = pool.map(func, my_list)

    print(results)

Expectation

In the example I check each element of my_list against list2 using multiprocessing, returning the element unchanged if it is in list2 and with " Not Found" appended otherwise. The expected result was:

[b'1', b'2 Not Found', b'4', b'8 Not Found']

but a NameError was raised.

Thanks in advance!


Solution

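  • Following on from my previous comment, here's one way you could achieve this. The NameError occurs because list2_array is assigned only inside the if __name__ == "__main__": block of the parent process, after the pool's workers have already been started, so the name never exists as a global in the workers. The key point here is the use of starmap to pass multiple arguments to your sub-process, so that func no longer relies on a global: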

    from multiprocessing import Pool
    
    def func(num, list2):
        if num in list2:
            return num
        else:
            return num + b" Not Found"
    
    if __name__ == "__main__":
        my_list = [b"1", b"2", b"4", b"8"]
        list2 = [b"1", b"4"]
    
        with Pool() as pool:
            args = [(ml, list2) for ml in my_list]
            results = pool.starmap(func, args)
    
        print(results)
    

    Output:

    [b'1', b'2 Not Found', b'4', b'8 Not Found']
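
If the data really does need to live in a multiprocessing.Array (for example because it must be shared rather than copied into every task), one common pattern is to hand the array to each worker through the Pool initializer. The following is only a minimal sketch of that idea: the helper name init_worker and the global name shared_array are hypothetical and do not appear in the original question.

```python
from ctypes import c_char
from multiprocessing import Array, Pool

# Hypothetical module-level name; each worker fills it in via init_worker.
shared_array = None


def init_worker(arr):
    """Run once in every worker process to store the shared array as a global."""
    global shared_array
    shared_array = arr


def func(num):
    # Membership test against the shared c_char array (each item is 1 byte).
    if num in shared_array:
        return num
    return num + b" Not Found"


if __name__ == "__main__":
    my_list = [b"1", b"2", b"4", b"8"]
    list2 = [b"1", b"4"]

    # Create the array before the pool and pass it via initargs, so it is
    # handed over while the worker processes are being started.
    list2_array = Array(c_char, list2)

    with Pool(initializer=init_worker, initargs=(list2_array,)) as pool:
        results = pool.map(func, my_list)

    print(results)
```

Compared with the starmap version, func keeps its single-argument signature and the array is passed once per worker rather than once per task. Note that a c_char array holds single bytes, so this sketch only suits one-byte items like those in the example.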