c++multithreadingsynchronizationmutexatomic

Can std::atomic::wait be used to replace the mutex?


I had this doubt after reading the question C++20 mutex with atomic wait and also Can std::atomic be used sometimes instead of std::mutex in C++? although similar, and my doubt is different, I will explain it below:

I want to know if it is safe to call the std::atomic::wait function to synchronize a piece of code, as many have suggested, it appears that it is possible to use std::atomic::wait, but I believe that perhaps it is not completely safe with risk of a dead-lock occurring

Imagine the following code below:

#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

std::atomic<bool> lock = false;

void go(){
    bool expected = false;
    while(!lock.compare_exchange_strong(expected, true)) {
        lock.wait(true);
        expected = false;
    }

    std::cout << "Sincronizado" << std::endl;

    lock = false;

    lock.notify_all();
}

int main() {
    std::vector<std::thread> threads;
    threads.reserve(2);

    for(int i = 0; i < 2; i++)
        threads.emplace_back(go);

    for(auto& thread : threads)
        thread.join();
}

So far so good, you may have already seen it in several Stackoverflow questions, but now think about the following scenario:

Thread 1:

+ Win the competition while(!lock.compare_exchange_strong(expected, true))
+ Get to the line before executing lock = false;
+ Thread 1 is preempted

Thread 2:

+ Lose the competition while(!lock.compare_exchange_strong(expected, true))
+ Start executing the line lock.wait(true);
+ Inside the line (which internally uses Futex) it checks if that lock is true
+ Still within the line, before blocking begins (FUTEX_WAIT) the thread is preempted

Thread 1:

+ Return to execution, and set lock to false.
+ Notifies all pending threads lock.notify_all();

Thread 2:

+ Return to execution, blocking begins (FUTEX_WAIT)
+ A possible dead-lock?

Is this scenario possible or am I being paranoid?

I apologize for any confusion, I'm using Google Translate, if you have any questions, please use the comments.

WAIT !

If my question is incomplete, or is formatted in an inappropriate way, please comment so I can adjust it, thank you in advance, in advance.

Lucas P.


Solution

  • For this exact reason, the FUTEX_WAIT system call takes the expected "old" value as an argument. Before going to sleep, it checks to ensure the futex word still contains this value, and if not, it returns immediately instead of sleeping.

    From the futex(2) man page:

    FUTEX_WAIT (since Linux 2.6.0)

    This operation tests that the value at the futex word pointed to by the address uaddr still contains the expected value val, and if so, then sleeps waiting for a FUTEX_WAKE operation on the futex word. The load of the value of the futex word is an atomic memory access (i.e., using atomic machine instructions of the respective architecture). This load, the comparison with the expected value, and starting to sleep are performed atomically and totally ordered with respect to other futex operations on the same futex word. If the thread starts to sleep, it is considered a waiter on this futex word. If the futex value does not match val, then the call fails immediately with the error EAGAIN.

    The purpose of the comparison with the expected value is to prevent lost wake-ups. If another thread changed the value of the futex word after the calling thread decided to block based on the prior value, and if the other thread executed a FUTEX_WAKE operation (or similar wake-up) after the value change and before this FUTEX_WAIT operation, then the calling thread will observe the value change and will not start to sleep.

    So in your example, Thread 2 will enter the FUTEX_WAIT system call with val == true. Since the actual value of lock at this point is false, the futex system call will notice this, and return without going to sleep. The lock.wait() will then likewise return, giving Thread 2 another chance to execute the compare_exchange and try to take the lock for itself.

    Presumably the kernel implements this by tentatively adding the current thread to a list of threads waiting on the futex, and then (after a memory barrier) reloading the futex word. If it still equals val then we put the thread to sleep. If not, we remove it from the list of waiters and return.

    Thus, assuming another thread is going to modify the value and then call FUTEX_WAKE, we are safe either way. If we observe the value val, we know that any store of another value happened after our load. Since we added ourself to the list of waiters before the load, and since any wake event would be signaled after the store, this proves that the wake event would occur after we were on the list of waiters, and so we would not miss it.