I'm getting a deadlock when trying to notify a condition_variable from a thread.
Here is my MCVE:
#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

static boost::mutex m_mutex;
static boost::condition_variable m_cond;

void threadFunc()
{
    std::cout << "LOCKING MUTEX" << std::endl;
    boost::mutex::scoped_lock lock( m_mutex );
    std::cout << "LOCKED, NOTIFYING CONDITION" << std::endl;
    m_cond.notify_all();
    std::cout << "NOTIFIED" << std::endl;
}

int main( int argc, char* argv[] )
{
    while( true )
    {
        std::cout << "TESTING!!!" << std::endl;
        boost::mutex::scoped_lock lock( m_mutex );
        boost::thread thrd( &threadFunc );
        //m_cond.wait( lock );
        while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) )
        {
            std::cout << "WAITING..." << std::endl;
        }
        static int pos = 0;
        std::cout << "DONE!!! " << pos++ << std::endl;
        thrd.join();
    }
    return 0;
}
If I use m_cond.wait( lock ); I see DONE!!! being written for every attempt, no problem here.
If I use the while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) ) loop, I see DONE!!! being written for a few attempts, and then, at some point, I get a deadlock and the waiting never ends:
TESTING!!!
LOCKING MUTEX
LOCKED, NOTIFYING CONDITION
NOTIFIED
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
...
I have read other posts on Stack Overflow (like "Condition variable deadlock"): they mention that this could happen if notify_all is called before the condition's wait function is running, so the mutex must be used to prevent that. But I feel like that's what I'm doing:
- the thread cannot notify until m_cond.timed_wait is reached (and then the mutex is unlocked)
- on timeout, timed_wait relocks the mutex so notify cannot be done, we print "WAITING..." and we release the mutex when we are again ready to receive the notification
So why is the deadlock occurring? Could the condition be notified between the moment when timed_wait detects the timeout and the moment it relocks the mutex?
The problem is that if timed_wait times out before notify_all is called, it must then wait for the thread to release the mutex (i.e. after the thread has called notify_all) before it can return. The loop then calls timed_wait again, but the thread has already finished, so no further notification will ever arrive and timed_wait will never succeed. There are two scenarios to guard against. One is your thread taking more than a millisecond to start (this should be unlikely, but the scheduling vagaries of your OS mean it could happen, especially if the CPU is busy). The other is a spurious wakeup, where timed_wait returns even though notify_all has not been called.
Both scenarios can be guarded against by setting a flag when calling notify_all, which the waiting thread can check to ensure notify_all has been called:
#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

static boost::mutex m_mutex;
static boost::condition_variable m_cond;

void threadFunc(bool& notified)
{
    std::cout << "LOCKING MUTEX" << std::endl;
    boost::mutex::scoped_lock lock(m_mutex);
    std::cout << "LOCKED, NOTIFYING CONDITION" << std::endl;
    // Record the event under the mutex so it is not lost if nobody is waiting.
    notified = true;
    m_cond.notify_all();
    std::cout << "NOTIFIED" << std::endl;
}

int main(int argc, char* argv[])
{
    while (true)
    {
        std::cout << "TESTING!!!" << std::endl;
        boost::mutex::scoped_lock lock(m_mutex);
        bool notified = false;
        boost::thread thrd(&threadFunc, boost::ref(notified));
        //m_cond.wait( lock );
        std::cout << "WAITING..." << std::endl;
        // The predicate overload returns the flag's value, so the loop only
        // ends once the thread has set it (the lambda requires C++11).
        while (!m_cond.timed_wait(lock, boost::posix_time::milliseconds(1), [&] { return notified; }))
        {
            std::cout << "WAITING..." << std::endl;
        }
        static int pos = 0;
        std::cout << "DONE!!! " << pos++ << std::endl;
        thrd.join();
    }
    return 0;
}