python multithreading

Python multithreading is faster than sequential code... why?


In many Stack Overflow Q&As about Python multithreading, I have read that because Python has the GIL, multithreaded code is slower than sequential code.

But my code does not seem to behave that way.

This is the multithreading code (updated 02-21-2023):

import threading
import time

global_v = 0
thread_lock = threading.Lock()


def thread_test(num):
    thread_lock.acquire()
    global global_v
    for _ in range(num):
        global_v += 1
    thread_lock.release()


# thread run
thread_1 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_2 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_3 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_4 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_5 = threading.Thread(target=thread_test, args=(9_000_000,))

thread_start = time.perf_counter()
# start thread
thread_1.start()
thread_2.start()
thread_3.start()
thread_4.start()
thread_5.start()
thread_end = time.perf_counter()

thread_1.join()
thread_2.join()
thread_3.join()
thread_4.join()
thread_5.join()
print(f"multithread run takes {thread_end-thread_start:.5f} sec")


# nomal run (sequential code)
def increment():
    global nomal_result

    for _ in range(45_000_000):
        nomal_result += 1


nomal_result = 0

start_time = time.perf_counter()
increment()
end_time = time.perf_counter()

print(f"nomal run takes {end_time-start_time:.5f} sec")

The result is

multithread run takes 0.21226 sec
nomal run takes 2.09347 sec

So my questions are:

Q1. Why is threading faster than sequential code in Python?

Q2. What is the difference between multithreading with a lock and sequential code? (I think that with a lock, the code behaves like sequential code with blocking.)

Please let me know!

Thank you.

python version 3.8.10

P.S. I moved my third question to a separate post: "Python multi-threading with lock is much faster why?"


Solution

  • This appears to be caused by the way CPython handles global variables. The following sequential version, which increments a local variable instead of a global one, is faster than your concurrent one with CPython 3.11 on my machine:

    import time
    
    
    def increment():
        # Local variable: CPython reads and writes it faster than a global
        nomal_result = 0
    
        for _ in range(5_000_000):
            nomal_result += 1
    
    
    start_time = time.perf_counter()
    increment()
    end_time = time.perf_counter()
    
    print(f"nomal run takes {end_time-start_time:.5f} sec")
    

    Your multithreaded code is thus not actually faster than your sequential one. The gap you measured is likely due to CPython optimizing the two versions differently, and it is mostly irrelevant to the threading question.

    The GIL does not prevent all code from being multithreaded efficiently. Here is a simple counterexample:

    import threading
    import time
    
    
    def thread_test():
        time.sleep(1.0)
    
    
    thread_start = time.perf_counter()
    
    thread_1 = threading.Thread(target=thread_test)
    thread_2 = threading.Thread(target=thread_test)
    
    thread_1.start()
    thread_2.start()
    
    thread_1.join()
    thread_2.join()
    
    thread_end = time.perf_counter()
    
    print(f"Task duration: {thread_end - thread_start:.5f} sec")
    

    compared to:

    import time
    
    
    def thread_test():
        time.sleep(1.0)
    
    
    thread_start = time.perf_counter()
    thread_test()
    thread_test()
    thread_end = time.perf_counter()
    
    print(f"Task duration: {thread_end - thread_start:.5f} sec")
    

    The multithreaded version takes only about 1 second, while the sequential one takes about 2 seconds.

    As a general rule, code that spends most of its time in C functions that release the GIL (such as NumPy) and code that is I/O-bound (such as network calls) will benefit from multithreading in Python. In contrast, CPU-bound pure-Python tasks such as your code will not benefit from it.
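
    To check the CPU-bound case yourself, the sketch below times the same pure-Python loop run in threads and run sequentially. Note that in the question's code `thread_end` is recorded right after the `start()` calls and before `join()`, so that measurement does not cover the threads' full run; here the clock is stopped only after every thread has been joined. The helper names (`count`, `run_threaded`, `run_sequential`) and the workload sizes are assumptions made for illustration.

```python
import threading
import time


def count(n):
    """Pure-Python CPU-bound loop; the thread holds the GIL while it runs."""
    total = 0
    for _ in range(n):
        total += 1
    return total


def run_threaded(n, workers):
    """Run count(n) in `workers` threads; return elapsed time including join()."""
    threads = [threading.Thread(target=count, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for the work to finish before stopping the clock
    return time.perf_counter() - start


def run_sequential(n, workers):
    """Run count(n) `workers` times in the main thread; return elapsed time."""
    start = time.perf_counter()
    for _ in range(workers):
        count(n)
    return time.perf_counter() - start


if __name__ == "__main__":
    n, workers = 2_000_000, 5  # illustrative workload
    print(f"threaded:   {run_threaded(n, workers):.5f} sec")
    print(f"sequential: {run_sequential(n, workers):.5f} sec")
```

    On a standard CPython build with the GIL, the two timings should come out in the same range, since only one thread executes Python bytecode at a time.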