In many Stack Overflow Q&As about Python multithreading, I read that Python has a GIL, so multithreading is slower than sequential code.
But in my code it doesn't look that way.
This is the multithreading code (updated 02-21-2023):
import threading
import time
global_v = 0
thread_lock = threading.Lock()
def thread_test(num):
    thread_lock.acquire()
    global global_v
    for _ in range(num):
        global_v += 1
    thread_lock.release()
# thread run
thread_1 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_2 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_3 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_4 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_5 = threading.Thread(target=thread_test, args=(9_000_000,))
thread_start = time.perf_counter()
# start thread
thread_1.start()
thread_2.start()
thread_3.start()
thread_4.start()
thread_5.start()
thread_end = time.perf_counter()
thread_1.join()
thread_2.join()
thread_3.join()
thread_4.join()
thread_5.join()
print(f"multithread run takes {thread_end-thread_start:.5f} sec")
# normal run (sequential code)
def increment():
    global nomal_result
    for _ in range(45_000_000):
        nomal_result += 1
nomal_result = 0
start_time = time.perf_counter()
increment()
end_time = time.perf_counter()
print(f"nomal run takes {end_time-start_time:.5f} sec")
The result is:
multithread run takes 0.21226 sec
nomal run takes 2.09347 sec
So my questions are:
Q1. Why is threading faster than sequential code in Python?
Q2. What is the difference between multithreading with a lock and sequential code? (I think that with a lock, the code behaves like sequential code with blocking.)
Please let me know. Thank you.
Python version: 3.8.10
P.S. I moved my 3rd question to a separate post: Python multi-threading with lock is much faster why?
It looks like it is caused by the way CPython treats globals. This sequential version is faster than your concurrent one using CPython 3.11 on my machine:
import time

def increment():
    # Local variable: no longer the module-level global
    nomal_result = 0
    for _ in range(5_000_000):
        nomal_result += 1

nomal_result = 0
start_time = time.perf_counter()
increment()
end_time = time.perf_counter()
print(f"nomal run takes {end_time-start_time:.5f} sec")
Your multithreaded code is thus not faster than your sequential one. The performance gap is likely due to different CPython optimizations between the two versions and it's mostly irrelevant.
The GIL does not prevent all code from being efficiently multithreaded. Here is a simple counterexample:
import threading
import time
def thread_test():
    time.sleep(1.0)
thread_start = time.perf_counter()
thread_1 = threading.Thread(target=thread_test)
thread_2 = threading.Thread(target=thread_test)
thread_1.start()
thread_2.start()
thread_1.join()
thread_2.join()
thread_end = time.perf_counter()
print(f"Task duration: {thread_end - thread_start:.5f} sec")
compared to:
import time
def thread_test():
    time.sleep(1.0)
thread_start = time.perf_counter()
thread_test()
thread_test()
thread_end = time.perf_counter()
print(f"Task duration: {thread_end - thread_start:.5f} sec")
The multithreaded version only takes 1 second while the sequential one takes 2 seconds.
As a general rule, code that spends most of its time in C functions that release the GIL (NumPy, for example) and code that is I/O-bound (network calls, for example) will benefit from multithreading in Python. In contrast, CPU-bound pure-Python tasks such as your code won't benefit from it.
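The CPU-bound case can be checked with a small timing sketch like the one below. The names `count_up`, `run_sequential`, `run_threaded`, `N` and `WORKERS` are hypothetical, and the loop size is kept small so that it runs quickly. Note that the timer is stopped only after every `join()` has returned. In the question's code, `thread_end` is taken right after the `start()` calls, so that measurement may leave out most of the work the threads do. On a standard GIL-enabled CPython build I would expect the two timings to be close, with the threaded one often slightly slower, but the numbers depend on the machine and the interpreter version.

```python
import threading
import time

N = 1_000_000   # iterations per worker (assumed value, kept small on purpose)
WORKERS = 5     # number of threads, as in the question


def count_up(n, results, index):
    """CPU-bound work: count to n in pure Python and store the result."""
    total = 0
    for _ in range(n):
        total += 1
    results[index] = total


def run_sequential(n, workers):
    """Run the work one call after another and return (results, seconds)."""
    results = [0] * workers
    start = time.perf_counter()
    for i in range(workers):
        count_up(n, results, i)
    return results, time.perf_counter() - start


def run_threaded(n, workers):
    """Run the same work in threads and return (results, seconds)."""
    results = [0] * workers
    threads = [
        threading.Thread(target=count_up, args=(n, results, i))
        for i in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for the work to finish before stopping the timer
    return results, time.perf_counter() - start


seq_results, seq_time = run_sequential(N, WORKERS)
thr_results, thr_time = run_threaded(N, WORKERS)
print(f"sequential: {seq_time:.5f} sec")
print(f"threaded:   {thr_time:.5f} sec")
```

Each worker writes to its own slot in `results`, so no lock is needed here.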