I tried the new free-threaded build of the interpreter and found that it actually takes longer than the GIL-enabled build. I did notice that CPU usage increases a lot with the free-threaded interpreter. Is there something I am misunderstanding about this new interpreter?
Version downloaded: python-3.13.0rc2-amd64
Code:
    from concurrent.futures import ThreadPoolExecutor
    from random import randint
    import time

    def create_table(size):
        a, b = size
        table = []
        for i in range(0, a):
            row = []
            for j in range(0, b):
                row.append(randint(0, 100))
            table.append(row)
        return table

    if __name__ == "__main__":
        start = time.perf_counter()
        with ThreadPoolExecutor(4) as pool:
            result = pool.map(create_table, [(1000, 10000) for _ in range(10)])
        end = time.perf_counter()
        print(end - start, *[len(each) for each in result])
python3.13t takes 56sec
python3.13 takes 26sec
python3.12 takes 25sec
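The primary culprit appears to be randint. The module-level functions in the random module are bound methods of a single hidden Random instance, so every thread ends up calling into the same shared generator object, and the free-threaded build appears to pay heavily for that contention. Another problem is that the pool can only work on 4 tables at a time. Since you want to create 10 tables in total and each takes about the same time, the work proceeds in roughly three rounds of 4, 4 and 2, and the last round leaves two workers idle.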
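The sharing described above can be checked directly. This is a minimal sketch, not a benchmark: it only shows that the module-level functions resolve to one generator object regardless of which thread calls them. The helper name record_generator is made up for illustration.

```python
import random
import threading

# The module-level functions are bound methods of one hidden Random instance.
shared = random.randint.__self__
print(type(shared).__name__)
print(random.choice.__self__ is shared)

seen = []  # identity of the generator object observed by each thread

def record_generator():
    # Hypothetical helper: note which object randint is bound to in this thread.
    seen.append(id(random.randint.__self__))

threads = [threading.Thread(target=record_generator) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread saw the same generator object.
all_same = len(set(seen)) == 1
print(all_same)
```

Because all four threads report the same object, any per-object cost in the free-threaded build is paid on every single randint call from every thread.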
Here is the code with the randint problem addressed by giving each call to create_table its own SystemRandom instance, so that no generator object is shared between threads:
    from concurrent.futures import ThreadPoolExecutor
    from random import SystemRandom
    import time

    def create_table(size):
        a, b = size
        table = []
        # Private generator for this call, instead of the shared module-level one.
        random = SystemRandom()
        for i in range(0, a):
            row = []
            for j in range(0, b):
                row.append(random.randint(0, 100))
            table.append(row)
        return table

    if __name__ == "__main__":
        start = time.perf_counter()
        with ThreadPoolExecutor(4) as pool:
            result = pool.map(create_table, [(1000, 10000) for _ in range(10)])
        end = time.perf_counter()
        print(end - start, *[len(each) for each in result])
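A variation on the same idea, offered as a sketch rather than a measured improvement: keep one generator per worker thread (not per call) using threading.local, and use a plain Random instead of SystemRandom. SystemRandom draws from the operating system's entropy source on every call, which is typically slower and is only needed when the numbers must be unpredictable. The names _local and thread_rng are made up for this example, and the table size is reduced so the snippet runs quickly.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from random import Random

_local = threading.local()  # hypothetical per-thread storage

def thread_rng():
    """Return this thread's private generator, creating it on first use."""
    rng = getattr(_local, "rng", None)
    if rng is None:
        rng = _local.rng = Random()
    return rng

def create_table(size):
    rows, cols = size
    rng = thread_rng()  # never shared with another thread
    return [[rng.randint(0, 100) for _ in range(cols)] for _ in range(rows)]

if __name__ == "__main__":
    sizes = [(100, 100)] * 10  # small sizes, for illustration only
    with ThreadPoolExecutor(max_workers=4) as pool:
        tables = list(pool.map(create_table, sizes))
    print(len(tables), len(tables[0]), len(tables[0][0]))
```

With this layout a pool of 4 workers creates at most 4 generators in total, however many tables are requested.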
And here is some code that does the same kind of work, but is more flexible about thread creation and avoids unnecessary inter-thread communication. Note that it starts four threads, so it builds four tables rather than ten:
    import threading
    from random import SystemRandom
    import time

    def create_table(obj, result: list[list[int]]):
        a, b = obj
        print(f"Starting thread {threading.current_thread().name}")
        random = SystemRandom()
        # Fill the caller's list in place, so nothing has to be passed back.
        result[:] = [[random.randint(0, 100) for j in range(b)] for i in range(a)]
        print(f"Finished thread {threading.current_thread().name}")

    if __name__ == "__main__":
        start = time.perf_counter()
        obj = (1000, 10000)
        results: list[list[list[int]]] = []
        threads: list[threading.Thread] = []
        # One thread per table, each writing into its own result list.
        for _ in range(4):
            result: list[list[int]] = []
            thread = threading.Thread(target=create_table, args=(obj, result))
            thread.start()
            threads.append(thread)
            results.append(result)
        # Wait for every thread before reading the results.
        for thread in threads:
            thread.join()
        print([len(r) for r in results])
        end = time.perf_counter()
        print(end - start)
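When comparing timings between builds, it can also help to confirm which mode the interpreter is really running in, since a free-threaded build can still run with the GIL switched back on. A minimal sketch, assuming Python 3.13 or later for sys._is_gil_enabled (older versions fall back to reporting the GIL as enabled); the function name gil_status is made up for this example.

```python
import sys
import sysconfig

def gil_status():
    """Return (build_is_free_threaded, gil_currently_enabled) as booleans."""
    # Py_GIL_DISABLED is set to 1 in free-threaded builds, unset or 0 otherwise.
    build_is_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists from 3.13; assume the GIL is on when it is missing.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(check()) if check is not None else True
    return build_is_free_threaded, gil_enabled

if __name__ == "__main__":
    free_threaded, gil_enabled = gil_status()
    print(f"free-threaded build: {free_threaded}, GIL enabled: {gil_enabled}")
```

Printing this at the top of the benchmark makes it clear which of the three interpreters produced each timing.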