c++multithreadingsynchronizationwaitcondition-variable

Setting condition variable monitor flag without a lock, is it valid?


The following code uses condition variable and a monitor flag to synchronize an operation between the main thread and thread2:

int main() {
    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> ready = false;
    std::thread thread2 = std::thread([&](){
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [&ready]{return ready.load();});
        std::cout << "Hello from thread2\n"; // 3 should print after 1
    });
    std::cout << "Hello from main thread\n"; // 1 we want this to be 1st
    ready = true; // 2, store to an atomic bool, without a lock, is it OK?
    cv.notify_one();
    thread2.join();
    std::cout << "Goodbye from main thread\n";
} 

In the code above, we use atomic<bool> for the monitor flag ready both so the read and write to this flag will not create a data race (a non-issue for most if not all platforms, but still UB "by the book") and to avoid reordering of the lines marked with 1 and 2 (the default store for atomic variable is memory_order_seq_cst which guarantees that everything that happened-before the store in this thread would be a visible side effect in the thread that performs a load for this varaible).

However, the code does not lock the modification of the ready flag (which is atomic) and the call to notify_one.

From this SO post it is clear that it is OK to leave the call to notify_one without a lock, it might even be more efficient as we do not want thread2 to be awake following the call to notify_one and then see that it should wait for a lock and be sent for a sleep by the os-scheduler, till the lock is released.

However, it is not clear whether the modification of the ready flag shall be done in a locked scope (using the same mutex used for the read), or using atomic<bool> is enough?


Solution

  • Update of the ready flag MUST be locked with the same mutex used for the read

    (And then the boolean may become a simple bool, instead of atomic<bool>).

    According to cppreference:

    Even if the shared variable is atomic, it must be modified while owning the mutex to correctly publish the modification to the waiting thread.

    This blog post is explaining quite nicely why a lock is required, and why using atomic is not enough. A similar explanation can be found in this SO post (on a related similar scenario) and in this additional SO post which lists the reasons for using a lock even with atomic variables. A very similar question is already discussed and explained also here and here.


    The problem

    Without a lock we may fall into the following race condition:

    1. Thread2 checks the ready flag, it is false, it plans to start waiting on the condition variable (by calling the basic cv.wait(lock) operation), but it is still before this call.
    2. The main thread sets ready flag to true and is quick enough to call cv.notify_one() while thread2 is not waiting on the condition variable yet.
    3. Thread2 now calls cv.wait(lock) and hangs forever, as the notification was already sent and was "lost".

    Let's prove the race condition

    To prove that the race condition, when not locking, is real, we can add a sleep in thread2 that mimics a valid timing scenario:

    std::thread thread2 = std::thread([&](){
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [&ready]{
            auto r = ready.load();
            std::this_thread::sleep_for(20ms); // timing that causes missing the event
            return r;
        });
    

    Adding this sleep actually makes thread2 to hang, QED: locking the modification of the ready flag is absolutely required, it's not just theoretical.


    Solution: lock!

    The following version solves the issue, by locking the modification of the ready flag, and we do not need now the flag to be atomic:

    int main() {
        std::mutex m;
        std::condition_variable cv;
        bool ready = false;
        std::thread thread2 = std::thread([&](){
            std::unique_lock<std::mutex> l(m);
            cv.wait(l, [&ready]{ return ready; });
            std::cout << "Hello from thread2\n";
        });
        std::cout << "Hello from main thread\n";
        // synchronization block
        {
            std::lock_guard<std::mutex> l(m);
            ready = true;
        }
        cv.notify_one();
        thread2.join();
        std::cout << "Goodbye from main thread\n";
    }
    

    How does it solve the race presented above?

    The main thread cannot set ready to true while the lock is owned by thread2. Since the lock is owned by thread2 till it is released inside the call to cv.wait(lock) (only when the wait starts the lock is released) the main thread will not be able to modify the ready flag before thread2 starts waiting on the condition variable. Thus the call in main to cv.notify_one() is guaranteed to happen when thread2 is already in wait state.


    A note on the usage of unique_lock for waiting on the conditional_variable, and lock_guard for setting ready to true: the first must use unique_lock (this is the API for wait which needs to call unlock internally), the second may use both but we go with lock_guard which is simpler (see also: std::unique_lock<std::mutex> or std::lock_guard<std::mutex>?).


    Waiting with a timeout

    It is to be noted that we may prefer waiting with a timeout (it is a general good advice to prefer waiting with a timeout, to avoid deadlocks and to have better traceability over threads status). In case we wait with a timeout we may decide to waive the locking and go back for the atomic<bool>, with something like this:

    int main() {
        std::mutex m;
        std::condition_variable cv;
        std::atomic<bool> ready = false;
        std::thread thread2 = std::thread([&](){
            std::unique_lock<std::mutex> l(m);
            // adding a timeout
            while(!cv.wait_for(l, 100ms, [&ready]{ return ready.load(); }));
            std::cout << "Hello from thread2\n";
        });
        std::cout << "Hello from main thread\n";
        ready = true; // no lock, cv.wait_for prevents us from hanging
        cv.notify_one();
        thread2.join();
        std::cout << "Goodbye from main thread\n";
    }
    

    There is of course a timing issue with this solution, as we may wait additional time (the timeout duration) for thread2 to make its operation, but if the operation is not of high priority this can be a valid solution.