c++ multithreading boost-thread condition-variable

dead-lock with condition_variable


I get a deadlock when trying to notify a condition_variable from a thread.

Here is my MCVE:

#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

static boost::mutex m_mutex;
static boost::condition_variable m_cond;

void threadFunc()
{
    std::cout << "LOCKING MUTEX" << std::endl;
    boost::mutex::scoped_lock lock( m_mutex );
    std::cout << "LOCKED, NOTIFYING CONDITION" << std::endl;
    m_cond.notify_all();
    std::cout << "NOTIFIED" << std::endl;
}

int main( int argc, char* argv[] )
{
    while( true )
    {
        std::cout << "TESTING!!!" << std::endl;

        boost::mutex::scoped_lock lock( m_mutex );

        boost::thread thrd( &threadFunc );

        //m_cond.wait( lock );
        while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) )
        {
            std::cout << "WAITING..." << std::endl;
        }

        static int pos = 0;
        std::cout << "DONE!!! " << pos++ << std::endl;

        thrd.join();
    }

    return 0;
}

If I use m_cond.wait( lock );, DONE!!! is printed on every iteration and there is no problem.

If I use the while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) ) loop instead, DONE!!! is printed for a few iterations, and then at some point the program deadlocks and the waiting never ends:

TESTING!!!
LOCKING MUTEX
LOCKED, NOTIFYING CONDITION
NOTIFIED
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
...

I have read other posts on Stack Overflow (such as "Condition variable deadlock"). They mention that this can happen if notify_all is called before the condition's wait function is running, so the mutex must be used to prevent that. But I believe that is what I'm doing.

So why does the deadlock occur? Could the condition be notified between the moment timed_wait detects the timeout and the moment it relocks the mutex?


Solution

  • The problem is that timed_wait can time out before notify_all is called. When that happens, the main thread must first reacquire the mutex, which the worker thread holds until after it has called notify_all. The notification is therefore sent while nobody is waiting on the condition, and it is lost. The loop then calls timed_wait again, but the worker has finished and will never notify again, so timed_wait never succeeds. The timeout can win this race whenever the thread takes more than a millisecond to start and reach notify_all: that should be unlikely, but the scheduling vagaries of your OS mean it can happen, especially if the CPU is busy. Spurious wakeups are a second hazard, because they let a wait return even though notify_all has not been called.

    Both scenarios can be guarded against by setting a flag before calling notify_all, which the waiting thread checks to confirm that the notification has really been sent:

    #include <iostream>
    #include <boost/thread.hpp>
    #include <boost/thread/mutex.hpp>
    #include <boost/thread/condition_variable.hpp>
    
    static boost::mutex m_mutex;
    static boost::condition_variable m_cond;
    
    void threadFunc(bool& notified)
    {
        std::cout << "LOCKING MUTEX" << std::endl;
        boost::mutex::scoped_lock lock(m_mutex);
        std::cout << "LOCKED, NOTIFYING CONDITION" << std::endl;
        notified = true;
        m_cond.notify_all();
        std::cout << "NOTIFIED" << std::endl;
    }
    
    int main(int argc, char* argv[])
    {
        while (true)
        {
            std::cout << "TESTING!!!" << std::endl;
    
            boost::mutex::scoped_lock lock(m_mutex);
    
            bool notified = false;
    
            boost::thread thrd(&threadFunc, boost::ref(notified));
    
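            // The predicate overload checks 'notified' before waiting and after
            // every wakeup, so a notification sent while nobody waits is not lost.
            std::cout << "WAITING..." << std::endl;
            while (!m_cond.timed_wait(lock, boost::posix_time::milliseconds(1), [&] { return notified; }))
            {
                std::cout << "WAITING..." << std::endl;
            }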
    
            static int pos = 0;
            std::cout << "DONE!!! " << pos++ << std::endl;
    
            thrd.join();
        }
    
        return 0;
    }
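The lost notification can also be reproduced on purpose instead of waiting for the scheduler to trigger it. The following is a minimal sketch, not the code from the question: it assumes C++11 and swaps in the standard library equivalents (std::mutex, std::condition_variable, wait_for) for the Boost types, and the names Signal, notify, wait_without_flag, wait_with_flag and late_waiter_sees_notification are hypothetical helpers invented for illustration. Joining the notifier before waiting forces the unlucky ordering where notify_all runs while nobody is waiting.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical helper: state shared between the notifier and the waiter.
struct Signal {
    std::mutex mutex;
    std::condition_variable cond;
    bool notified = false;  // guarded by mutex
};

// Notifier: set the flag while holding the mutex, then notify.
void notify(Signal& s)
{
    std::lock_guard<std::mutex> guard(s.mutex);
    s.notified = true;
    s.cond.notify_all();
}

// Bare timed wait, like the loop in the question, but bounded to max_tries
// so that it gives up instead of spinning forever.
bool wait_without_flag(Signal& s, int max_tries)
{
    std::unique_lock<std::mutex> lock(s.mutex);
    for (int i = 0; i < max_tries; ++i) {
        if (s.cond.wait_for(lock, std::chrono::milliseconds(1)) == std::cv_status::no_timeout)
            return true;  // woken up (by a notification or spuriously)
    }
    return false;  // every attempt timed out
}

// Predicate wait, like the loop in the answer: the flag is checked before
// blocking, so a notification that was sent earlier is still observed.
bool wait_with_flag(Signal& s, int max_tries)
{
    std::unique_lock<std::mutex> lock(s.mutex);
    for (int i = 0; i < max_tries; ++i) {
        if (s.cond.wait_for(lock, std::chrono::milliseconds(1), [&s] { return s.notified; }))
            return true;
    }
    return false;
}

// Forces the problematic ordering: the notifier runs to completion before
// the waiter starts waiting.
bool late_waiter_sees_notification(bool use_flag)
{
    Signal s;
    std::thread notifier(notify, std::ref(s));
    notifier.join();  // notify_all has already been called, with no waiter
    return use_flag ? wait_with_flag(s, 10) : wait_without_flag(s, 10);
}
```

Calling late_waiter_sees_notification(true) from main returns true, because the predicate sees the flag before blocking. With false, every wait_for is expected to time out, apart from a rare spurious wakeup, which is the same situation as the endless WAITING... output in the question.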