c++multithreadingc++17thread-synchronization

Limited access (counter) to a resource with condition variable: how to avoid deadlock?


I have a multithreaded code accessing a limited resource (the maximum number of simultaneous thread is defined by max_accesses). The resource can be used from several part of the code, so I have to provide a tool to count and manage the access.

For the moment, I do it by the following way, but I discovered a deadlock in some random cases: sometimes, the condition_variable waits with no one to wake it up because all the accessing threads release between the decision to wait and the effective wait.

I'm not comfortable with condition variables and I don't know how to modify my procedure to avoid this deadlock. I don't have a test able to reproduce the bug with a strong probability. I prefer ask for help here and then avoid other insidious bugs.

std::mutex mutex_counter;
std::mutex mutex_wait;
std::condition_variable cv_wait;
std::size_t num_accesses = 0;
std::size_t max_accesses = 8;

bool acquire()
{
  {
    std::lock_guard<std::mutex> lock_counter(mutex_counter);
    if(num_accesses < max_accesses)
    {
      ++num_accesses;
      return true;
    }
    // If I'm here, the maximum number of accesses is reached, I have to wait
  }
  std::unique_lock<std::mutex> lock_wait(mutex_wait);
  while(true)
  {
    {
      std::lock_guard<std::mutex> lock_counter(mutex_counter);
      if(num_accesses < max_accesses)
      {
        ++num_accesses;
        return true;
      }
    }
    // Sometimes, here num_accesses == 0, because all the accesing threads just released
    // Then it's an infinite wait because there is no remaining thread to notify
    cv_wait.wait(lock_wait);
  }
}

bool release()
{
  std::lock_guard<std::mutex> lock_counter(mutex_counter);
  assert(num_accesses > 0);
  --num_accesses;
  cv_wait.notify_all();
}

Solution

  • // Sometimes, here num_accesses == 0, because all the accesing threads just released
    // Then it's an infinite wait because there is no remaining thread to notify
    

    FYI: There's a name for that problem, It's called "lost notification." You prevent it by using the same mutex for the counter and, for "waiting:"

    bool acquire() {
       std::lock_guard<std::mutex> lg{the_one_and_only_mutex};
       while (num_accesses >= max_accesses) {
    
           // Here, num_accesses STILL is greater than or equal to max_accesses
           // because the_one_and_only_mutex still is locked. No other thread 
           // could have changed it, and no other thread could have notified
           // cv_wait since the last time we tested num_accesses in this loop.
    
           cv_wait.wait(lg);
       }
    
       // Here, num_accesses STILL is less than max_accesses because
       // the_one_and_only_mutex still is locked, so no other thread
       // could have changed it since we exited the loop.
    
       ++ num_accesses;
       return true;
    }
    
    bool release() {
      std::lock_guard<std::mutex> lg(the_one_and_only_mutex);
      assert(num_accesses > 0);
      --num_accesses;
      cv_wait.notify_all();
    }