python, parallel-processing, multiprocessing, locking

Share a lock between Python processes across multiple files


I am trying to use the multiprocessing map function to execute a function over my list. However, in this function I make some API calls that cannot run at the same time, since that would overload the API. My code is spread out over multiple files because it is quite long.

I have tried to follow this solution, but it does not work for me. I suspect it fails because I am working with more than one file. The specific error I get is: NameError: name 'lock' is not defined.

Here is a simplified version of my code:

main.py

from multiprocessing import Pool, Lock

import apicaller
import file2

def evaluate_entry_of_list(entry):
    response = file2.get_data(entry)
    # some calculations
    return results

def init_pool(given_lock):
    global lock
    lock = given_lock

if __name__ == '__main__':

    list = apicaller.get_list()
    t_lock = Lock()
    with Pool(8, initializer=init_pool, initargs=(t_lock,)) as pool:
        results = pool.map(evaluate_entry_of_list, list)
        process_results(results)

file2.py

import requests

def make_call(url, body) -> requests.Response:
    lock.acquire()  # NameError here: 'lock' is not defined in this module
    # Make API call
    lock.release()
    return response

Other solutions I tried: defining a variable in main.py and importing it into file2.py, and using a separate class to hold a static variable and importing that into file2.py.


Solution

  • The simple solution is to place the pool initializer function in the same module as the function that actually uses the lock. The global statement in init_pool binds the name lock in the globals of the module where init_pool is defined, while make_call looks the name up in file2's globals, so both functions must live in file2.py:

    file2.py

    import requests

    def init_pool(given_lock):
        global lock
        lock = given_lock

    def make_call(url, body) -> requests.Response:
        lock.acquire()
        # Make API call
        lock.release()
        return response
    

    main.py

    
    ...
    
    if __name__ == '__main__':
    
        list = apicaller.get_list()
        t_lock = Lock()
        with Pool(8, initializer=file2.init_pool, initargs=(t_lock,)) as pool:
            results = pool.map(evaluate_entry_of_list, list)
            process_results(results)
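
For reference, here is a minimal, self-contained sketch of the same pattern collapsed into a single file (slow_api_call and its 0.1-second sleep are just stand-ins for the real rate-limited request):

    import time
    from multiprocessing import Pool, Lock

    def init_pool(given_lock):
        # Runs once in every worker process; binds the shared lock to a
        # module-level global so the worker function below can see it.
        global lock
        lock = given_lock

    def slow_api_call(entry):
        # Only one worker at a time can enter this block, so the "API"
        # is never hit concurrently.
        with lock:
            time.sleep(0.1)  # stand-in for the rate-limited request
        return entry * 2

    if __name__ == '__main__':
        t_lock = Lock()
        with Pool(4, initializer=init_pool, initargs=(t_lock,)) as pool:
            print(pool.map(slow_api_call, range(10)))

Using the lock as a context manager (with lock:) instead of explicit acquire()/release() also guarantees the lock is released if the API call raises an exception.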