python, python-3.x, python-subinterpreters

Are Python 3.12 subinterpreters multi-CPU?


Python 3.12 exposes subinterpreter functionality: after starting the main Python interpreter you can spawn multiple subinterpreters. Questions:

  1. As the subinterpreters run in the same process as the Python main thread, does that mean they can only use the CPU core(s) the main thread is running on?
  2. If so, how would you use subinterpreters so that they employ the full range of CPUs/cores available (typically 8-16 on modern hardware)?

Let me explain the context:

I'm using a C(++) reactor framework that brings its own task-scheduling runtime: the execution of a reactor's multiple reactions (methods) is scheduled over a pool of runtime workers.

Using the subinterpreter C-API, I'm trying to arrange that reactions can call into Python code and update their reactor's PyThreadState. To that end,

  1. I start the Python main interpreter at the top-level reactor.
  2. I initialize each sub-reactor with its own PyThreadState* and a private subinterpreter that owns its own GIL:
PyInterpreterConfig py_config = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 1,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,
};
this->py_status = Py_NewInterpreterFromConfig(
      &this->py_thread, &py_config);
assert(this->py_thread != NULL);
this->py_interp = PyThreadState_GetInterpreter(this->py_thread);
this->py_interp_id = _PyInterpreterState_GetIDObject(this->py_interp);
assert(this->py_interp_id != NULL);
PyObject *main_mod = _PyInterpreterState_GetMainModule(this->py_interp);
this->py_ns = PyModule_GetDict(main_mod);
Py_DECREF(main_mod);
Py_INCREF(this->py_ns);
const char* codestr =
    "id = \"Reactor_1\";"
    "import threading;"
    "print(f\"{id} @thread \" + 
str(threading.get_ident()));";
PyObject *result = PyRun_StringFlags(
    codestr, Py_file_input, this->py_ns, this->py_ns, NULL);

What I find, though, is that all Python (sub)interpreters (called upon in the various reactions) run in the same thread; using the subinterpreter framework I would expect them to run on different cores, i.e. in different threads. (Obviously, but IMO unrelatedly, the workers that the reactions are scheduled on at runtime do run in separate threads.)

(Relatedly, I find that shutting down a subinterpreter, e.g. with Py_EndInterpreter(this->py_thread);, in a finalizing reaction may crash the process when this->py_thread is not the current thread state.)
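As an aside on that crash: Py_EndInterpreter() documents that the thread state passed to it must be the current thread state, so the teardown has to run on a thread that is (or can be made) attached to that subinterpreter. A minimal sketch, reusing the field names from the snippet above and assuming it runs on the thread that created and used the subinterpreter:

PyThreadState *ts = this->py_thread;
this->py_thread = NULL;
PyThreadState *prev = PyThreadState_Swap(ts);  /* make the subinterpreter's state current */
Py_EndInterpreter(ts);                         /* destroys the interpreter; no thread state is current afterwards */
if (prev != NULL && prev != ts) {
    PyThreadState_Swap(prev);                  /* re-attach whatever was current before */
}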

Perhaps someone has an idea of how to resolve this?


Solution

  • I have some experience with sub-interpreters in pure-Python apps (they are available in 3.12 via import _xxsubinterpreters as interpreters). A sub-interpreter runs in the thread it is started from - or, when one uses the interpreters.run_string(...) method, it runs in whatever thread .run_string was called from (in the main interpreter). The same OS thread, as seen from Python, will be used by several interpreters - and thus, extra interpreter instances can't execute concurrent work on their own.

    so, yes, you have to call the functions in the other interpreters from other threads to get any gain in concurrency.
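    Applied to the embedding scenario from the question, that means every subinterpreter has to be created and driven from its own native thread. Below is a minimal, self-contained sketch (plain C with pthreads, not the reactor framework's actual code) of what that can look like with Python 3.12's per-interpreter GIL; the two scripts then execute under separate GILs on separate OS threads, which the OS is free to schedule on different cores:

#include <Python.h>
#include <pthread.h>

static void *worker(void *arg)
{
    const char *code = (const char *)arg;

    /* Attach this native thread to the main interpreter first, so that a
       current thread state exists when the subinterpreter is created. */
    PyThreadState *main_ts = PyThreadState_New(PyInterpreterState_Main());
    PyEval_AcquireThread(main_ts);

    PyInterpreterConfig cfg = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 1,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,   /* per-interpreter GIL */
    };
    PyThreadState *sub_ts = NULL;
    PyStatus status = Py_NewInterpreterFromConfig(&sub_ts, &cfg);
    if (PyStatus_Exception(status)) {
        /* Creation failed: detach from the main interpreter and give up.
           (Error handling is kept minimal in this sketch.) */
        PyThreadState_Clear(main_ts);
        PyThreadState_DeleteCurrent();
        return NULL;
    }

    /* The subinterpreter's thread state is now current in this thread and
       this thread holds that interpreter's own GIL, so the other worker
       can execute Python code at the same time. */
    PyRun_SimpleString(code);

    /* Py_EndInterpreter() requires its argument to be the current thread
       state, which it is on this thread. */
    Py_EndInterpreter(sub_ts);

    /* Re-attach to the main interpreter only to dispose of main_ts. */
    PyEval_AcquireThread(main_ts);
    PyThreadState_Clear(main_ts);
    PyThreadState_DeleteCurrent();
    return NULL;
}

int main(void)
{
    Py_Initialize();
    PyThreadState *main_ts = PyEval_SaveThread();   /* release the main GIL */

    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker,
        (void *)"import threading; print('interp 1 @thread', threading.get_ident())");
    pthread_create(&t2, NULL, worker,
        (void *)"import threading; print('interp 2 @thread', threading.get_ident())");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    PyEval_RestoreThread(main_ts);
    Py_FinalizeEx();
    return 0;
}

    The essential point is that a subinterpreter never creates an OS thread by itself: whatever thread calls into it is the thread (and hence the core) it runs on, so parallelism only appears once each interpreter is driven from its own thread.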