python gil

Which functions release the GIL in Python?


I found this sentence about the GIL on the Python wiki:

Luckily, many potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL.

Is there a list of functions that run outside the GIL (at least for the Python standard library)?

Or how can I know whether a specific function is outside the GIL?


Solution

  • Starting from the original naming and tracing to the current implementation you can find these private functions:

    with these you can trace up to:

    in the ceval.c file. If you grep for those, you'll reach the parts of the code that acquire or release the lock. Where a module contains a release, you can assume that at least some part of it runs without the GIL. Conversely, a C function that never releases the lock keeps holding it for the whole call, because the interpreter already holds the GIL when it calls into C code.

    This should give you a starting point for tracing it, if you really want to go that way. However, I doubt there is a definitive list of such functions even for the standard library: the codebase is too large and changes too often for such documentation to be kept up to date. I'd like to be proven wrong, though.

    Also, as pointed out in the comments, there are the two macros Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, which should turn up more matches in the code. (The GitHub link might require a login.)

    Alternatively, if it's completely blocked by a mandatory login screen, clone the repository and grep locally:

    git clone --depth 1 https://github.com/python/cpython
    grep -nr -C 5 Py_BEGIN_ALLOW_THREADS cpython
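
The grep above prints every match with context, which can be a lot to read. As a rough sketch of the same idea in Python, the helper below counts occurrences of the macro per C file so you can see which files release the GIL most often. The function name `count_gil_releases` is made up for this illustration, and it assumes the repository was cloned into a local directory such as `cpython`.

```python
import os
from collections import Counter

MACRO = "Py_BEGIN_ALLOW_THREADS"


def count_gil_releases(root, macro=MACRO):
    """Map each .c file under `root` to the number of times `macro` appears in it."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".c"):
                continue  # only C sources are scanned in this sketch
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                hits = handle.read().count(macro)
            if hits:
                counts[path] = hits
    return counts
```

For example, `count_gil_releases("cpython").most_common(20)` would list the 20 files with the most matches. A match only tells you that the file releases the GIL somewhere; you still have to read the code to see which function does it and under what conditions.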
    

    For the quote you have:

    Luckily, many potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL.

    I'd rather explain it this way: performance-critical tasks such as I/O and heavy calculations are implemented in a lower-level language (such as C) rather than in Python. The C modules that do the hard work release the lock before starting it, then reacquire it when they need to touch the Python interpreter's context or variables to store the result. That keeps the hard work at the speed of its C implementation, not slowed down by communicating with the interpreter's internals.
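
If reading the C source is not an option, a timing experiment can give a hint about a specific function. This is only a heuristic sketch, not a definitive test: it runs a callable twice in a row and then twice in parallel threads. If the threaded run is clearly faster, the callable most likely releases the GIL for most of its work. The helper names (`timed_serial`, `timed_threaded`) and the two sample workloads are invented for this illustration; `time.sleep` is used because it is known to release the GIL while waiting.

```python
import threading
import time


def timed_serial(func, repeats=2):
    """Run func `repeats` times one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return time.perf_counter() - start


def timed_threaded(func, repeats=2):
    """Run func in `repeats` threads at once and return the elapsed seconds."""
    threads = [threading.Thread(target=func) for _ in range(repeats)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


def sleep_briefly():
    # Blocking call implemented in C; it releases the GIL while it waits.
    time.sleep(0.2)


def spin_in_python():
    # Pure-Python loop; the interpreter holds the GIL while running bytecode.
    total = 0
    for i in range(1_000_000):
        total += i
    return total


if __name__ == "__main__":
    for func in (sleep_briefly, spin_in_python):
        serial = timed_serial(func)
        threaded = timed_threaded(func)
        print(f"{func.__name__}: serial {serial:.2f}s, threaded {threaded:.2f}s")
```

On a standard (GIL-enabled) CPython build, the sleeping workload should finish in roughly half the serial time when threaded, while the pure-Python loop should show little or no speedup. Timings vary with machine load, so treat the result as a hint and confirm it in the source when it matters.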