pythonpython-3.xgil

Why does this Python code with threading have race conditions?


This code creates a race condition:

import threading

ITERS = 100000
x = [0]

def worker():
    for _ in range(ITERS):
        x[0] += 1  # this line creates a race condition
        # because it takes a value, increments and then writes
        # some inrcements can be done together, and lost

def main():
    x[0] = 0  # you may use `global x` instead of this list trick too
    t1 = threading.Thread(target=worker)
    t2 = threading.Thread(target=worker)
    t1.start()
    t2.start()
    t1.join()
    t2.join()

for i in range(5):
    main()
    print(f'iteration {i}. expected x = {ITERS*2}, got {x[0]}')

Output:

$ python3 test.py
iteration 0. expected x = 200000, got 200000
iteration 1. expected x = 200000, got 148115
iteration 2. expected x = 200000, got 155071
iteration 3. expected x = 200000, got 200000
iteration 4. expected x = 200000, got 200000

Python3 version:

Python 3.9.7 (default, Sep 10 2021, 14:59:43) 
[GCC 11.2.0] on linux

I thought GIL would prevent it and not allow two threads run together until they do something io-related or call a C library. At least this is what you may conclude from the docs.

Turns out I was wrong. Then, what does GIL actually do, and when do threads run in parallel?


Solution

  • Reading the docs better, I think there's the answer:

    The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, at the expense of much of the parallelism afforded by multi-processor machines.

    However, some extension modules, either standard or third-party, are designed so as to release the GIL when doing computationally-intensive tasks such as compression or hashing. Also, the GIL is always released when doing I/O.

    I guess this means that each line of source code consists of multiple blocks of bytecode. Bytecode lines/blocks are atomic, i.e. they get executed alone, but the source lines aren't.

    Here's the byte code that +=1 exapands to (run dis.dis('x[0] += 1') to see):

              0 LOAD_NAME                0 (x)
              2 LOAD_CONST               0 (0)
              4 DUP_TOP_TWO
              6 BINARY_SUBSCR
              8 LOAD_CONST               1 (1)
             10 INPLACE_ADD
             12 ROT_THREE
             14 STORE_SUBSCR
             16 LOAD_CONST               2 (None)
             18 RETURN_VALUE
    

    When these lines are executed in concurrent way, race condition occurs.

    So, GIL does not save you from it. It only prevents race conditions that could damage complex structures like list or dict.