python python-asyncio cpython event-loop

Python Asyncio source code analysis: Why does `_get_running_loop` in Python execute the C implementation instead of the Python one?


I've been exploring the asyncio source code and noticed that the function _get_running_loop() is defined in Python but also carries a note stating that it's implemented in C (in _asynciomodule.c).

# python3.11/asyncio/events.py
def get_running_loop():
    """Return the running event loop. Raise a RuntimeError if there is none.
    This function is thread-specific.
    """
    # NOTE: this function is implemented in C (see _asynciomodule.c)
    loop = _get_running_loop()
    if loop is None:
        raise RuntimeError('no running event loop')
    return loop

def _get_running_loop():
    """Return the running event loop or None.
    This is a low-level function intended to be used by event loops.
    This function is thread-specific.
    """
    # NOTE: this function is implemented in C (see _asynciomodule.c)
    running_loop, pid = _running_loop.loop_pid
    if running_loop is not None and pid == os.getpid():
        return running_loop

If _get_running_loop() is already defined in Python, why does the note say that the actual implementation is written in C?

The following code in CPython looks like the real implementation:

static PyObject *
_asyncio_get_running_loop_impl(PyObject *module)
/*[clinic end generated code: output=c247b5f9e529530e input=2a3bf02ba39f173d]*/
{
    PyObject *loop;
    _PyThreadStateImpl *ts = (_PyThreadStateImpl *)_PyThreadState_GET();
    loop = Py_XNewRef(ts->asyncio_running_loop);
    if (loop == NULL) {
        /* There's no currently running event loop */
        PyErr_SetString(
            PyExc_RuntimeError, "no running event loop");
        return NULL;
    }
    return loop;
}

I'm aware that asyncio uses C extensions for performance reasons, but I’d like to understand how this mechanism works, specifically how Python binds the Python-level function to the C-level implementation.

Thank you!


Solution

  • Generally speaking, stdlib modules that provide a C implementation are expected to follow the requirements outlined in PEP 399: a pure-Python implementation must also be provided, even if a C version exists.

    Hence, the affected stdlib modules are structured so that a pure-Python class or function is defined first, and typically at the very end of the module there is a try: ... except ImportError: block that attempts to import the C version. When the import is successful, the C-based version shadows the Python-defined one; on failure, either no shadowing occurs or a pure-Python version is then defined for compatibility.

    For example, the asyncio.events module has something like the following near the end of the file, which shadows the pure-Python implementation defined earlier (the example is truncated; all examples are taken from Python 3.11, as per the question, and later versions of Python may have changed the specifics):

    # Alias pure-Python implementations for testing purposes.
    _py__get_running_loop = _get_running_loop
    ...
    
    try:
        # get_event_loop() is one of the most frequently called
        # functions in asyncio.  Pure Python implementation is
        # about 4 times slower than C-accelerated.
        from _asyncio import (_get_running_loop, ...)
    except ImportError:
        pass
    else:
        # Alias C implementations for testing purposes.
        _c__get_running_loop = _get_running_loop
        ...
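
    To see which implementation is active in a given interpreter, something like the following sketch can be used. It assumes the private aliases _py__get_running_loop and _c__get_running_loop exist as in the 3.11 source quoted above; they are looked up with getattr because private names may differ between versions:

```python
import asyncio.events as events
import types


def describe(func):
    """Classify a callable as C-implemented or pure Python."""
    if func is None:
        return "not available"
    if isinstance(func, types.BuiltinFunctionType):
        return "C (built-in)"
    return "pure Python"


# The name that asyncio actually calls at runtime.
active = events._get_running_loop
# Private aliases kept by events.py for testing; may be absent in other versions.
py_impl = getattr(events, "_py__get_running_loop", None)
c_impl = getattr(events, "_c__get_running_loop", None)

print("active :", describe(active))
print("_py_   :", describe(py_impl))
print("_c_    :", describe(c_impl))
print("active is the C version:", c_impl is not None and active is c_impl)
```

    On a standard CPython build the active function is normally the C one, but the output depends on whether the _asyncio extension was importable.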
    

    Similarly, the heapq module has the following towards the bottom of the module, which shadows the Python implementation defined above it:

    # If available, use C implementation
    try:
        from _heapq import *
    except ImportError:
        pass
    ...
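
    A quick way to check what the star import replaced is to classify the public names of heapq. This is only a sketch: the result depends on the interpreter build, and it relies on the assumption that helpers such as merge are not provided by the C module and therefore stay pure Python:

```python
import heapq
import types

c_backed = []
pure_python = []
for name in heapq.__all__:
    obj = getattr(heapq, name)
    # Functions coming from a C extension are built-in function objects.
    if isinstance(obj, types.BuiltinFunctionType):
        c_backed.append(name)
    else:
        pure_python.append(name)

print("C-backed:   ", sorted(c_backed))
print("pure Python:", sorted(pure_python))
```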
    

    Likewise for the datetime module, we have:

    try:
        from _datetime import *
    except ImportError:
        pass
    

    This doesn't provide the same level of "support" for having the Python and C implementations available concurrently, which this question has more details on (incidentally, Python 3.12 onwards has changed how this module is structured).

    Not all stdlib modules shadow everything. For example, typing only tries to import _idfunc from the C module and, should that fail, provides an implementation in Python:

    try:
        from _typing import _idfunc
    except ImportError:
        def _idfunc(_, x):
            return x
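
    Either way the observable behaviour should be the same. The sketch below assumes, based on my reading of the 3.11 source, that typing.NewType relies on _idfunc when the new type is called; _idfunc is a private name, so it is looked up defensively:

```python
import types
import typing

# Private helper; it may not exist in every Python version.
idfunc = getattr(typing, "_idfunc", None)
if idfunc is None:
    kind = "not present in this Python version"
elif isinstance(idfunc, types.BuiltinFunctionType):
    kind = "imported from the C module"
else:
    kind = "pure-Python fallback"
print("typing._idfunc:", kind)

# Calling a NewType returns its argument unchanged, whichever version is used.
UserId = typing.NewType("UserId", int)
print(UserId(5))  # 5
```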
    

    This answer is not meant to be an exhaustive list of the Python stdlib modules that have a C counterpart. Its goal is to explain in general how Python code in the stdlib gets shadowed by a C-based module: the override isn't magical, and the specifics of how it is done can be found in the Python code of the affected module itself.