python multiprocessing cpu-speed

Why does per-process overhead constantly increase for multiprocessing?


On a 6-core CPU with 12 logical CPUs, I was counting up to very high numbers in a for-loop, several times over.

To speed things up, I used multiprocessing. I was expecting something like:

What I found instead was a continuous increase in elapsed time. I'm confused.

The code was:

#!/usr/bin/python

from multiprocessing import Process, Queue
import random
from timeit import default_timer as timer

def rand_val():
    num = []
    for i in range(200000000):
        num = random.random()
    print('done')

def main():

    for iii in range(15):
        processes = [Process(target=rand_val) for _ in range(iii)]
        start = timer()
        for p in processes:
            p.start()

        for p in processes:
            p.join()

        end = timer()
        print(f'elapsed time: {end - start}')
        print('for ' + str(iii))
        print('')

if __name__ == "__main__":
    main()
    print('done')

Result:

. . .


Solution

  • You are making two incorrect assumptions:

    1. Processes are not free. Merely adding processes adds overhead to the program.
    2. Processes do not own CPUs. A CPU interleaves execution of several processes.

    The first point is why you see some overhead even though there are fewer processes than CPUs. Note that your system usually has several background processes running, so "fewer processes than CPUs" is not a clear-cut condition for a single application. (A minimal sketch measuring this overhead is included at the end of this answer.)

    The second point is why you see the execution time increase gradually once there are more processes than CPUs. Any OS running mainline Python does preemptive multitasking of processes; roughly, this means a process does not hold on to a CPU until it is done, but is paused regularly so that other processes can run.
    In effect, several processes end up sharing one CPU. Since the CPU can still only do a fixed amount of work per unit of time, all processes take longer to complete. (The second sketch at the end of this answer shows one way to avoid this by capping the number of worker processes.)
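
To illustrate the first point, here is a minimal sketch (standard library only, worker does nothing) that times starting and joining a growing number of processes; since the workers do no work, any measured time is pure process-management overhead:

from multiprocessing import Process
from timeit import default_timer as timer

def noop():
    # The worker does nothing, so any measured time is the cost of
    # creating, starting and joining the processes themselves.
    pass

def main():
    for n in (1, 5, 10, 15):
        processes = [Process(target=noop) for _ in range(n)]
        start = timer()
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print(f'{n} empty processes: {timer() - start:.3f} s')

if __name__ == '__main__':
    main()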
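
For the second point, a common way to avoid oversubscribing the CPUs is to cap the number of worker processes at the number of logical CPUs and let extra tasks wait in a queue. A minimal sketch of that idea, assuming the same kind of CPU-bound busy loop as in the question (with a smaller iteration count so it finishes quickly), could look like this:

import os
import random
from multiprocessing import Pool
from timeit import default_timer as timer

def busy_loop(_):
    # Same kind of CPU-bound loop as in the question,
    # shortened so the sketch finishes quickly.
    for _ in range(20_000_000):
        random.random()
    return 'done'

def main():
    n_tasks = 15
    n_workers = os.cpu_count()  # at most one worker per logical CPU
    start = timer()
    with Pool(processes=n_workers) as pool:
        pool.map(busy_loop, range(n_tasks))
    print(f'{n_tasks} tasks on {n_workers} workers: {timer() - start:.1f} s')

if __name__ == '__main__':
    main()

With this layout the total time grows in batches of roughly n_tasks / n_workers, instead of every additional process slowing down all the others.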