I have a pool of threads (the QueueWorkers class) in my program that are released using this logic:
int QueueWorkers::stop()
{
    for (unsigned int ix = 0; ix < threadIds.size(); ++ix)
    {
        pthread_cancel(threadIds[ix]);
        pthread_join(threadIds[ix], NULL);
    }
    return 0;
}
where threadIds is a class variable of type std::vector<pthread_t>.
This logic works most of the time, but my testing shows that it fails with some probability. In particular, sometimes after pthread_cancel executes, the pthread_join statement on the next line never returns and my program hangs.
As far as I understand, calling pthread_join on a cancelled thread should always return. Are there any circumstances that could prevent this, or any way of debugging what is going on here? Is my approach to releasing threads upon termination the right one?
Additional information: Threads have a cancellation cleanup handler (registered using pthread_cleanup_push) which frees dynamic memory used by the thread to avoid leaks. Under normal circumstances, the handler is called upon pthread_cancel and works fine, but whenever pthread_join fails to return, I have checked that the cleanup handler is not invoked.
Thanks in advance!
EDIT: as suggested in the question comments, I have modified my code to check the return value of pthread_cancel. It is always 0, regardless of whether pthread_join then works as expected or not.
EDIT2: as requested in a comment on this question, let me provide more detail on how it works.
The pool of threads is initialized by the start() method:
int QueueWorkers::start()
{
    // numberOfThreads and pQueue are class variables
    for (int i = 0; i < numberOfThreads; ++i)
    {
        pthread_t tid;
        pthread_create(&tid, NULL, workerFunc, pQueue);
        threadIds.push_back(tid);
    }
    return 0;
}
The start function workerFunc() is as follows (simplified):
static void* workerFunc(void* pQueue)
{
    // Initialize some dynamic objects (Foo for simplification)
    Foo* foo = initFoo();
    // Set pthread_cancel handler
    pthread_cleanup_push(workerFinishes, foo);
    // Loop forever
    for (;;)
    {
        // Wait for new item to process on pQueue
        ... paramsV = ((Queue*) pQueue)->pop();
        // Then process it
        ...
    }
    // Next statement never executes but compilation breaks without it. See this note in pthread.h:
    // "pthread_cleanup_push and pthread_cleanup_pop are macros and must always be used in
    // matching pairs at the same nesting level of braces".
    pthread_cleanup_pop(0);
}
Note the pthread_cleanup_push() statement before starting the eternal loop. This is done to implement the cleanup logic upon cancellation for the Foo object:
static void workerFinishes(void* curl)
{
    freeFoo((Foo*) curl);
}
I hope I have not over-simplified the code. In any case, you can see the original version here.
Are you sure the thread ever reaches a cancellation point, or that your thread's cancellation type is asynchronous?
From the man page of pthread_cancel:
A thread's cancellation type, determined by pthread_setcanceltype(3), may be either asynchronous or deferred (the default for new threads). Asynchronous cancelability means that the thread can be canceled at any time (usually immediately, but the system does not guarantee this). Deferred cancelability means that cancellation will be delayed until the thread next calls a function that is a cancellation point. A list of functions that are or may be cancellation points is provided in pthreads(7).
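To illustrate the deferred case, here is a minimal sketch (not the asker's code) of a worker whose loop has no blocking call. With the default deferred type, such a thread only acts on a pending request when it reaches a cancellation point; pthread_mutex_lock, for example, is not one. The names busyWorker, onCancel, cleanupRan and cancelBusyWorker are hypothetical and exist only for this example.

```cpp
#include <pthread.h>
#include <atomic>

// Hypothetical flag, set by the cleanup handler so the caller can observe it.
static std::atomic<bool> cleanupRan(false);

static void onCancel(void*)
{
    cleanupRan = true;
}

// A worker whose loop contains no blocking call. Without the explicit
// pthread_testcancel() a deferred cancellation request would stay pending
// and pthread_join() on this thread would never return.
static void* busyWorker(void*)
{
    pthread_cleanup_push(onCancel, NULL);
    for (;;)
    {
        // ... CPU-bound work with no cancellation point would go here ...
        pthread_testcancel();  // explicit cancellation point
    }
    pthread_cleanup_pop(0);    // never reached, required to pair the macros
    return NULL;
}

// Returns true if the thread was cancelled and its cleanup handler ran.
bool cancelBusyWorker()
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, busyWorker, NULL) != 0)
        return false;

    pthread_cancel(tid);

    void* result = NULL;
    if (pthread_join(tid, &result) != 0)
        return false;

    return result == PTHREAD_CANCELED && cleanupRan;
}
```

Calling cancelBusyWorker() should return promptly because the loop polls for cancellation on every iteration. If a hang appears only occasionally, it may be worth checking which call each stuck thread is blocked in (for instance with a debugger backtrace) and whether that call is listed as a cancellation point in pthreads(7).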
I don't think canceling threads is the best way to make sure that a thread will finish. Perhaps you can send the thread a message telling it to stop, and make sure the thread receives the message and handles it.
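One way to send such a message is to let the queue itself carry the stop request. The sketch below is only an assumption about how this could look: StoppableQueue, cooperativeWorker and runAndStop are hypothetical names, the queue holds plain ints, and a heap-allocated int stands in for the Foo object from the question. Here pop() returns false once stop() has been called and the queue is empty, so each worker leaves its loop and frees its own resources without any cleanup handler.

```cpp
#include <pthread.h>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for the question's Queue.
class StoppableQueue
{
public:
    StoppableQueue() : stopped(false)
    {
        pthread_mutex_init(&mtx, NULL);
        pthread_cond_init(&cond, NULL);
    }
    ~StoppableQueue()
    {
        pthread_cond_destroy(&cond);
        pthread_mutex_destroy(&mtx);
    }
    void push(int item)
    {
        pthread_mutex_lock(&mtx);
        items.push(item);
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&mtx);
    }
    // Blocks until an item is available or stop() was called.
    // Returns false only when the queue is stopped and empty.
    bool pop(int& out)
    {
        pthread_mutex_lock(&mtx);
        while (items.empty() && !stopped)
            pthread_cond_wait(&cond, &mtx);
        bool gotItem = !items.empty();
        if (gotItem)
        {
            out = items.front();
            items.pop();
        }
        pthread_mutex_unlock(&mtx);
        return gotItem;
    }
    // The "message": wake every waiting worker and tell it to finish.
    void stop()
    {
        pthread_mutex_lock(&mtx);
        stopped = true;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&mtx);
    }
private:
    std::queue<int> items;
    bool stopped;
    pthread_mutex_t mtx;
    pthread_cond_t cond;
};

static void* cooperativeWorker(void* arg)
{
    StoppableQueue* q = static_cast<StoppableQueue*>(arg);
    int* scratch = new int(0);   // stands in for the question's Foo
    int item;
    while (q->pop(item))
        *scratch += item;        // "process" the item
    delete scratch;              // ordinary cleanup on the normal exit path
    return NULL;
}

// Starts n workers, feeds them a few items, then stops and joins them.
// Returns the number of threads that were joined successfully.
int runAndStop(int n)
{
    StoppableQueue q;
    std::vector<pthread_t> tids;
    for (int i = 0; i < n; ++i)
    {
        pthread_t tid;
        if (pthread_create(&tid, NULL, cooperativeWorker, &q) == 0)
            tids.push_back(tid);
    }
    for (int i = 0; i < 10; ++i)
        q.push(i);

    q.stop();

    int joined = 0;
    for (std::size_t ix = 0; ix < tids.size(); ++ix)
        if (pthread_join(tids[ix], NULL) == 0)
            ++joined;
    return joined;
}
```

With this design, stop() in the pool reduces to calling the queue's stop() and then joining every thread. No thread is interrupted while holding a lock, because each one exits through its normal return path.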