Why does defining the worker's run method using a nested function in the constructor cause it to be run in the main thread?

When trying to have some code executing in a different thread in Qt using PySide6, I'm making use of Python's principle of retaining state in a closure when a function is nested in another function inside the constructor of a worker class like this:

class _ConcurrentWorker(QObject):
    success = Signal()
    failed = Signal(str)
    finished = Signal()
    quit = Signal()

    def __init__(self, do_work: Callable[[], None], repeat):
        super().__init__()
        self._loop = True
        self.quit.connect(self.on_quit)

        def run():
            try:
                while self._loop:
                    do_work()
                    self.success.emit()
                    if not repeat:
                        break
            except Exception as e:
                self.failed.emit(repr(e))
            finally:
                self.finished.emit()
        self.run = run

    def on_quit(self):
        self._loop = False

This is wrapped by another class ConcurrentTask which creates a QThread and moves the worker to it.

from typing import Callable, Optional
from PySide6.QtCore import QObject, QThread, Signal

class ConcurrentTask:
    class _ConcurrentWorker(QObject):
        success = Signal()
        failed = Signal(str)
        finished = Signal()
        quit = Signal()

        def __init__(self, do_work: Callable[[], None], repeat):
            super().__init__()
            self._loop = True
            self.quit.connect(self.on_quit)

            def run():
                try:
                    while self._loop:
                        do_work()
                        self.success.emit()
                        if not repeat:
                            break
                except Exception as e:
                    self.failed.emit(repr(e))
                finally:
                    self.finished.emit()
            self.run = run

        def on_quit(self):
            self._loop = False

    def __init__(self, do_work: Callable[[], None], on_success: Callable[[], None] = None, on_failed: Callable[[str], None] = None, repeat=False):

        def create_worker():
            worker = self._ConcurrentWorker(do_work, repeat)
            if on_success:
                worker.success.connect(on_success)
            if on_failed:
                worker.failed.connect(on_failed)
            return worker

        self.create_worker = create_worker
        self.worker: Optional[_ConcurrentWorker] = None
        self.thread: Optional[QThread] = None

    def _setup_task(self):
        self.worker = self.create_worker()
        self.thread = QThread()
        self.worker.moveToThread(self.thread)
        self.thread.started.connect(self.worker.run)
        self.worker.finished.connect(self.thread.quit)
        self.worker.finished.connect(self.worker.deleteLater)
        self.thread.finished.connect(self.thread.deleteLater)

    def start(self):
        self._setup_task()
        self.thread.start()

Let's say the function provided by the constructor argument do_work looks something like this:

from time import sleep

def do_work():
    sleep(2)

What happens now when a program calls start() on an instance of ConcurrentTask is that the method run of _ConcurrentWorker seems to be executing in the main thread, ultimately blocking the GUI, instead of the QThread that the worker has been moved to.

Can anyone explain what I'm missing here since I don't understand this behaviour.

This code works fine if I define _ConcurrentWorker like this:

class _ConcurrentWorker(QObject):
    success = Signal()
    failed = Signal(str)
    finished = Signal()
    quit = Signal()

    def __init__(self, do_work: Callable[[], None], repeat):
        super().__init__()
        self._do_work = do_work
        self._repeat = repeat
        self._loop = True
        self.quit.connect(self.on_quit)

    def run(self):
        try:
            while self._loop:
                self._do_work()
                self.success.emit()
                if not self._repeat:
                    break
        except Exception as e:
            self.failed.emit(repr(e))
        finally:
            self.finished.emit()

    def on_quit(self):
        self._loop = False

Solution

Qt thread affinity

While your usage of "state in a closure" may seem fine, it is not completely valid in the Qt world, where signals call their receiving slots in the thread that has (or not) a thread affinity.

I cannot go too into deep about the intricacies of PySide (and PyQt), but you have to remember that, these modules are Python bindings of a C++ library, with all its pros and cons.
And mixing python threading and Qt threading may be a risky matter if you are not careful enough.

Simplifying, from the python perspective, the self.run reference you're creating was made in the main thread, and Qt will "see" it as that, meaning that it will "think" that the thread.started.connect(worker.run) connection will be a direct connection: it will be run in the caller thread, the main one.

You can see it by adding a basic print(int(QThread.currentThreadId())) after creating the QApplication instance and at the beginning of your run function: they will be the same, so your do_work function will actually be blocking.

A simple "solution" (but, is it? wait for it...)

Interestingly enough, the issue can be worked around by doing a simple modification:

class ConcurrentTask:
    class _ConcurrentWorker(QObject):
        # ...
        def __init__(self, do_work, repeat):
            # ...
            self._run = run

        def run(self):
            self._run()

That will work because the Python/Qt mechanism will find the "affinity" above based on the method ownership: the instance has been moved to a thread, and that method (a member of an instance, not a simple function) will be called in the receiver thread.

...No, it is not (aka: why bother?)

The reality, though, is that you are making things more complicated than they could (or should):

it doesn't (usually) make a lot of sense to declare a function in a local scope as an instance method: unless you are completely sure of what you are doing, it is often cause of issues, silent errors, bugs difficult to track down and unexpected behavior (which is exactly your case); while the basic "but it works!" principle may be normally fine, remember that this is not pure python: everything we do on the Qt side from Python is affecting wrapped objects that are abstractions of their C++ counterparts;
nested classes are rarely actually needed, the only benefit they may have is to make those classes "virtually private" (but they are not really private), but the drawbacks they also cause are far more dangerous if you are not careful enough;

Note that the latter aspect if of extreme importance when dealing with bindings to other languages, things that may rely on very complex and delicate mechanisms that should not be mangled by things that could work fine and seem harmless in a normal Python semantic.

♪So what?♪

The solution is quite simple: use explicit classes and expose their methods.

If you really want to make your classes "private", then you could still use name mangling for the module, but keep doing it on top level classes and with their own methods.

Also, consider that most of the times there's no point in relying on the moveToThread() approach, at least with Python bindings: the internal QThread run() function is actually executed in its own thread and will run a separate event loop in it (which is what calls the functions/slots connected to its started signal); since Python doesn't provide real concurrency (due to the GIL), and you will normally not need an event loop at all, the most common approach is to directly subclass QThread and override run() with the actual "worker" processing.

Note that there is still a common misconception about that (notably starting from this post) for which you should "never" subclass QThread. That's not only a non-rule, but it's also a valid approach, especially in Python: you can subclass QThread and override run(). You just need to keep in mind that whatever happens in that function is run in that thread, it is potentially blocking (mostly due to the GIL), which means that there is no event loop running, including what would eventually take care of incoming signals.