I believe I've found a bug in the Windows' Qt 5 implementation. It doesn't reproduce with Qt 6, so I don't think I should post it to Qt's maintainers now. But still I'd like to ask here (1) if it's a bug indeed (or is my code just incorrect somewhere), and (2) what workaround can I write to avoid this issue, provided that I can't upgrade to Qt 6 right now.
I have a class BackgroundExecutor
which owns a QThread
and has a function for posting new tasks (std::function
instances) to it. In its destructor, BackgroundExecutor
calls quit
and wait
member functions of its thread object.
Things get interesting when one of the posted tasks processed by the background QThread
happens to have executed some external QProcess
(I think it affects the state of the thread's QEventLoop
somehow). In this case, the wait
call on a QThread
has a chance to hang forever.
The call stack of the main thread looks like this:
ntdll.dll!NtWaitForSingleObject() Unknown
KernelBase.dll!WaitForSingleObjectEx() Unknown
> Qt5Cored.dll!QThread::wait(QDeadlineTimer deadline) Line 630 C++
QThreadAndQProcessBug.exe!BackgroundExecutor::~BackgroundExecutor() Line 87 C++
QThreadAndQProcessBug.exe!RunTest() Line 123 C++
QThreadAndQProcessBug.exe!RunTestMultipleTimes() Line 132 C++
And here's the call stack of the background thread:
win32u.dll!NtUserMsgWaitForMultipleObjectsEx() Unknown
user32.dll!RealMsgWaitForMultipleObjectsEx() Unknown
> Qt5Cored.dll!QEventDispatcherWin32::processEvents(QFlags<enum QEventLoop::ProcessEventsFlag> flags) Line 625 C++
Qt5Cored.dll!QEventLoop::processEvents(QFlags<enum QEventLoop::ProcessEventsFlag> flags) Line 140 C++
Qt5Cored.dll!QEventLoop::exec(QFlags<enum QEventLoop::ProcessEventsFlag> flags) Line 232 C++
Qt5Cored.dll!QThread::exec() Line 547 C++
Qt5Cored.dll!QThread::run() Line 617 C++
Qt5Cored.dll!QThreadPrivate::start(void * arg) Line 407 C++
It's stuck at the line 625 (as of Qt 5.15.2) of "qeventdispatcher_win.cpp", inside the QEventDispatcherWin32::processEvents
function: waitRet = MsgWaitForMultipleObjectsEx(nCount, pHandles, INFINITE, QS_ALLINPUT, MWMO_ALERTABLE | MWMO_INPUTAVAILABLE);
.
The full text of the program which reproduces the issue (though it might take some time - one of my PCs requires only 1000 iterations on average, while the other one might execute 100'000 iterations before the hang would occur):
#include <QCoreApplication>
#include <QTimer>
#include <QObject>
#include <QProcess>
#include <QThread>
#include <functional>
#include <future>
#include <memory>
#include <iostream>
Q_DECLARE_METATYPE(std::function<void()>); // for passing std::function<void()> through Qt's signals
static void EnsureStdFunctionOfVoidMetaTypeRegistered()
{
static std::once_flag std_function_metatype_registered{};
std::call_once(std_function_metatype_registered, []() {
qRegisterMetaType<std::function<void()>>("std::function<void()>");
});
}
class WorkerObject; // worker object that "lives" in a background thread of a BackgroundExecutor
class BackgroundExecutor final
{
public:
BackgroundExecutor();
~BackgroundExecutor();
// posts a new task for the background QThread,
// returns a std::future which can be waited on to ensure the task is done
[[nodiscard]] std::future<void> PostTask(std::function<void()> task);
private:
WorkerObject* _background_worker = nullptr;
QThread* _qt_thread = nullptr;
};
class WorkerObject final : public QObject
{
Q_OBJECT;
public:
WorkerObject()
{
connect(this, &WorkerObject::TaskPosted, this, &WorkerObject::ProcessPostedTask);
}
// can be called from any thread;
// "moves" the task to the background worker thread via Qt's signals/slots mechanism
// so that it could be processed there
void PostTask(const std::function<void()>& task)
{
EnsureStdFunctionOfVoidMetaTypeRegistered();
Q_EMIT TaskPosted(task);
}
private Q_SLOTS:
void ProcessPostedTask(const std::function<void()>& posted_task)
{
std::invoke(posted_task);
}
Q_SIGNALS:
void TaskPosted(const std::function<void()>&);
};
BackgroundExecutor::BackgroundExecutor()
{
{
std::unique_ptr<QThread> qt_thread_safe(new QThread()); // exception safety
_background_worker = new WorkerObject();
_qt_thread = qt_thread_safe.release();
}
_background_worker->moveToThread(_qt_thread);
QObject::connect(_qt_thread, &QThread::finished, _background_worker, &WorkerObject::deleteLater);
QObject::connect(_qt_thread, &QThread::finished, _qt_thread, &QThread::deleteLater);
_qt_thread->start();
}
BackgroundExecutor::~BackgroundExecutor()
{
_qt_thread->quit();
_qt_thread->wait(); // !!! might hang !!!
}
[[nodiscard]] std::future<void> BackgroundExecutor::PostTask(std::function<void()> task)
{
std::shared_ptr task_promise = std::make_shared<std::promise<void>>();
std::future task_future = task_promise->get_future();
std::function<void()> task_wrapper = [task_promise = std::move(task_promise), task = std::move(task)]()
{
std::invoke(task);
task_promise->set_value();
};
_background_worker->PostTask(task_wrapper);
return task_future;
}
static void RunQProcessAndWaitForFinished()
{
QProcess process;
process.setProgram("C:\\Windows\\System32\\cmd.exe");
process.setArguments({ "/C", "C:\\Windows\\System32\\timeout.exe", QString::number(30) });
process.start();
process.waitForStarted(-1);
process.waitForFinished(-1);
}
static void RunTest()
{
BackgroundExecutor executor;
std::future task_future = executor.PostTask([]() {
RunQProcessAndWaitForFinished();
});
task_future.get();
}
static void RunTestMultipleTimes()
{
constexpr int repeat = 500'000;
for (int i = 0; i < repeat; ++i)
{
std::cout << "starting iteration " << i << '\n';
RunTest();
}
std::cout << "all iterations finished" << '\n';
}
int main(int argc, char** argv)
{
QCoreApplication qt_app{ argc, argv };
QTimer::singleShot(
0,
[&]()
{
RunTestMultipleTimes();
qt_app.exit(0);
});
return qt_app.exec();
}
#include "main.moc"
After posting this question, I also ran some tests with Qt 5.12.12 and Qt 5.15.5, as suggested in the comments. I could reproduce the bug in neither of these versions, while in Qt 5.15.2 it continues to reproduce consistently. This could either mean that the bug is really absent in these versions, or that it is much more rarely reproducible there (at least on my PCs) and I didn't try enough.
In any case, I would like to share a workaround which I came up with after consulting one of my colleagues who has more experience with WinAPI than I do. Hope it will also help someone else in the future, if they happen to be stuck with the Qt version where this bug is present.
Here's the code (you can compare it with the original listing from the question using a tool such as WinMerge, or just look at all the parts near the WORKAROUND_FOR_QTHREAD_WAIT
macro):
#include <QCoreApplication>
#include <QTimer>
#include <QObject>
#include <QProcess>
#include <QThread>
#include <atomic>
#include <thread>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <cassert>
#include <iostream>
#ifdef _WIN32
#define WORKAROUND_FOR_QTHREAD_WAIT 1
#define WIN32_LEAN_AND_MEAN
#define NOMINMAX
#include <Windows.h>
#endif // _WIN32
Q_DECLARE_METATYPE(std::function<void()>); // for passing std::function<void()> through Qt's signals
static void EnsureStdFunctionOfVoidMetaTypeRegistered()
{
static std::once_flag std_function_metatype_registered{};
std::call_once(std_function_metatype_registered, []() {
qRegisterMetaType<std::function<void()>>("std::function<void()>");
});
}
class WorkerObject; // worker object that "lives" in a background thread of a BackgroundExecutor
class BackgroundExecutor final
{
public:
BackgroundExecutor();
~BackgroundExecutor();
// posts a new task for the background QThread,
// returns a std::future which can be waited on to ensure the task is done
[[nodiscard]] std::future<void> PostTask(std::function<void()> task);
private:
WorkerObject* _background_worker = nullptr;
QThread* _qt_thread = nullptr;
};
class WorkerObject final : public QObject
{
Q_OBJECT;
public:
WorkerObject()
{
connect(this, &WorkerObject::TaskPosted, this, &WorkerObject::ProcessPostedTask);
}
// can be called from any thread;
// "moves" the task to the background worker thread via Qt's signals/slots mechanism
// so that it could be processed there
void PostTask(const std::function<void()>& task)
{
EnsureStdFunctionOfVoidMetaTypeRegistered();
Q_EMIT TaskPosted(task);
}
#if WORKAROUND_FOR_QTHREAD_WAIT
[[nodiscard]] DWORD GetBackgroundThreadID() const
{
while (_windows_thread_id_initialized.load(std::memory_order_acquire) == false)
std::this_thread::yield();
return _windows_thread_id;
}
#endif // WORKAROUND_FOR_QTHREAD_WAIT
public Q_SLOTS:
void InitializeBackgroundThreadID()
{
#if WORKAROUND_FOR_QTHREAD_WAIT
_windows_thread_id = ::GetCurrentThreadId();
_windows_thread_id_initialized.store(true, std::memory_order_release);
#else
assert(false && "shouldn't have been invoked in this configuration");
#endif // WORKAROUND_FOR_QTHREAD_WAIT
}
void ProcessPostedTask(const std::function<void()>& posted_task)
{
std::invoke(posted_task);
}
Q_SIGNALS:
void TaskPosted(const std::function<void()>&);
#if WORKAROUND_FOR_QTHREAD_WAIT
private:
DWORD _windows_thread_id;
std::atomic<bool> _windows_thread_id_initialized{ false };
#endif // WORKAROUND_FOR_QTHREAD_WAIT
};
BackgroundExecutor::BackgroundExecutor()
{
{
std::unique_ptr<QThread> qt_thread_safe(new QThread()); // exception safety
_background_worker = new WorkerObject();
_qt_thread = qt_thread_safe.release();
}
_background_worker->moveToThread(_qt_thread);
#if WORKAROUND_FOR_QTHREAD_WAIT
QMetaObject::invokeMethod(_background_worker, "InitializeBackgroundThreadID", Qt::QueuedConnection);
#endif // WORKAROUND_FOR_QTHREAD_WAIT
QObject::connect(_qt_thread, &QThread::finished, _background_worker, &WorkerObject::deleteLater);
QObject::connect(_qt_thread, &QThread::finished, _qt_thread, &QThread::deleteLater);
_qt_thread->start();
}
BackgroundExecutor::~BackgroundExecutor()
{
_qt_thread->quit();
#if WORKAROUND_FOR_QTHREAD_WAIT
const DWORD background_thread_id = _background_worker->GetBackgroundThreadID();
while (_qt_thread->wait(1'000) == false)
// "awaken" the MsgWaitForMultipleObjectsEx function call in which the background thread got stuck
// by posting a fake "message" to it, so that it would snap out of it, check its exit flag and finish properly
::PostThreadMessage(background_thread_id, WM_NULL, 0, 0);
#else
_qt_thread->wait();
#endif // WORKAROUND_FOR_QTHREAD_WAIT
}
[[nodiscard]] std::future<void> BackgroundExecutor::PostTask(std::function<void()> task)
{
std::shared_ptr task_promise = std::make_shared<std::promise<void>>();
std::future task_future = task_promise->get_future();
std::function<void()> task_wrapper = [task_promise = std::move(task_promise), task = std::move(task)]()
{
std::invoke(task);
task_promise->set_value();
};
_background_worker->PostTask(task_wrapper);
return task_future;
}
static void RunQProcessAndWaitForFinished()
{
QProcess process;
process.setProgram("C:\\Windows\\System32\\cmd.exe");
process.setArguments({ "/C", "C:\\Windows\\System32\\timeout.exe", QString::number(30) });
process.start();
process.waitForStarted(-1);
process.waitForFinished(-1);
}
static void RunTest()
{
BackgroundExecutor executor;
std::future task_future = executor.PostTask([]() {
RunQProcessAndWaitForFinished();
});
task_future.get();
}
static void RunTestMultipleTimes()
{
constexpr int repeat = 500'000;
for (int i = 0; i < repeat; ++i)
{
std::cout << "starting iteration " << i << '\n';
RunTest();
}
std::cout << "all iterations finished" << '\n';
}
int main(int argc, char** argv)
{
QCoreApplication qt_app{ argc, argv };
QTimer::singleShot(
0,
[&]()
{
RunTestMultipleTimes();
qt_app.exit(0);
});
return qt_app.exec();
}
#include "main.moc"
And now for the implementation details.
The basic idea is: the MsgWaitForMultipleObjectsEx(..., QS_ALLINPUT, ...)
function call (in which the background thread gets stuck) can be "awakened" by a message posted to this thread's queue, since QS_ALLINPUT
also implies QS_POSTMESSAGE
as a part of its "wake mask". This can be done by calling PostThreadMessage
from the destroying (main) thread in case we've detected that the background one got stuck. And since we don't care about a particular type of a message (and actually don't want to send any "meaningful" message), WM_NULL
will do the trick.
So, the destructor of our BackgroundExecutor
class should look like this:
BackgroundExecutor::~BackgroundExecutor()
{
_qt_thread->quit();
const DWORD background_thread_id = _background_worker->GetBackgroundThreadID();
while (_qt_thread->wait(1'000) == false)
::PostThreadMessage(background_thread_id, WM_NULL, 0, 0);
}
Now the question is, how do we get this DWORD background_thread_id
value. Qt doesn't provide us an easy way of getting it from a QThread
object (there is a function currentThreadId
, but it is static
and returns a DWORD
ID of the currently executing thread, thus it's not what we want here). Instead, we will call GetCurrentThreadId
from the background thread at some early point of its execution, store it in our WorkerObject
, and retrieve in the main thread later.
In order to do that, we can write a new slot InitializeBackgroundThreadID
in the WorkerObject
class and invoke it via Qt::QueuedConnection
from the BackgroundExecutor
's constructor after moving the worker object to the background thread (thus ensuring that the slot will be invoked from that thread):
public Q_SLOTS:
void InitializeBackgroundThreadID()
{
_windows_thread_id = ::GetCurrentThreadId();
_windows_thread_id_initialized.store(true, std::memory_order_release);
}
(here, _windows_thread_id
is a DWORD
member variable of the WorkerObject
, while _windows_thread_id_initialized
is a std::atomic<bool>
guard flag required for cross-thread synchronization via the synchronizes-with relation)
In BackgroundExecutor
's constructor:
...
_background_worker->moveToThread(_qt_thread);
QMetaObject::invokeMethod(_background_worker, "InitializeBackgroundThreadID", Qt::QueuedConnection);
...
Now we can implement the getter for this thread ID:
[[nodiscard]] DWORD GetBackgroundThreadID() const
{
while (_windows_thread_id_initialized.load(std::memory_order_acquire) == false)
std::this_thread::yield();
return _windows_thread_id;
}
In practice, by the time the BackgroundExecutor
's destructor is called, the thread ID will have been already initialized, so the loop won't spin or call yield
at all: it is just needed to guarantee the synchronizes-with relation (the main thread will be guaranteed to read the correct value of _windows_thread_id
after it has read true
from _windows_thread_id_initialized
via a load-acquire operation).
There can be other means of implementing this part, e.g., we could make the DWORD _windows_thread_id
member variable an atomic
itself, providing it some "invalid" sentry value for the "uninitialized" state (Raymond Chen once wrote zero should be okay for that). But these are just the implementation details, the basic idea still remains the same.