Python multiprocessing pool blocking main thread


I have the following snippet which attempts to split processing across multiple sub-processes.

def search(self):
    print("Checking queue for jobs to process")
    if self._job_queue.has_jobs_to_process():
        print("Queue threshold met, processing jobs.")
        job_sub_lists = partition_jobs(self._job_queue.get_jobs_to_process(), self._process_pool_size)
        # Skip empty partitions so that no worker is handed an empty sub-list.
        populated_sub_lists = [sub_list for sub_list in job_sub_lists if len(sub_list) > 0]
        self._process_pool.map(process, populated_sub_lists)
        print("Job processing pool mapped")

The search function is called by the main process in a while loop. If the queue reaches a threshold count, the processing pool is mapped to the process function with the jobs sourced from the queue. My question is: does the Python multiprocessing pool block the main process during execution, or does execution continue immediately? I want to avoid the scenario where has_jobs_to_process() evaluates to true and, while those jobs are still being processed, it evaluates to true again for another set of jobs, so that self._process_pool.map(process, populated_sub_lists) is called a second time. I do not know the consequences of calling map again while processes are running.


Solution

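  • multiprocessing.Pool.map blocks the calling thread (not necessarily the MainThread!) until all results are ready; it does not block the whole process. Other threads of the parent process keep running. In your while loop this means search() does not return until every sub-list has been processed, so the same thread cannot call map a second time while the first call is still running. You could even call pool.map from multiple threads in the parent process without breaking things (though that rarely makes sense), because Pool's internal _taskqueue is a thread-safe queue from the standard queue module (queue.Queue, or queue.SimpleQueue in newer CPython versions).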
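
To see the difference between blocking the calling thread and blocking the whole process, here is a minimal sketch. It is not taken from the question's code: run_demo, blocking_call, and the timings dictionary are hypothetical names chosen for illustration, and time.sleep stands in for the real process function. The helper thread calls pool.map and stays blocked, while the main thread carries on immediately.

```python
import threading
import time
from multiprocessing import Pool


def run_demo(delay=0.5):
    """Call pool.map from a helper thread and record what the main thread sees.

    `delay` is an illustrative stand-in for the duration of one real job.
    """
    timings = {}

    with Pool(processes=2) as pool:

        def blocking_call():
            # pool.map blocks *this* thread until both sleeps have finished.
            start = time.perf_counter()
            pool.map(time.sleep, [delay, delay])
            timings["map_seconds"] = time.perf_counter() - start

        helper = threading.Thread(target=blocking_call)

        start = time.perf_counter()
        helper.start()
        # The main thread gets here right away; it is not blocked by map.
        timings["main_resumed_after"] = time.perf_counter() - start
        timings["map_running_when_main_resumed"] = helper.is_alive()

        # Wait for the helper before the with-block tears the pool down.
        helper.join()

    return timings


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(name, value)
```

In the question's setup there is only one thread, so the blocking call is exactly what prevents a second map from overlapping the first. If the loop ever needs to stay responsive while jobs run, Pool.map_async is the non-blocking counterpart; it returns an AsyncResult that can be checked with ready() or waited on with wait().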