c++multithreadingsingletonweak-ptr

weak_ptr to singleton not thread-safe


I'm writing a function that returns a shared_ptr to a singleton. I want the singleton object to be destroyed when all of the references have gone away. My solution builds on this accepted answer which uses a static weak_ptr and mutex, but my test for its thread-safety does not pass consistently.

Here is a complete example that illustrates the problem.

#include <gtest/gtest.h>
#include <atomic>
#include <mutex>
#include <memory>
#include <vector>
#include <thread>

using namespace std;

// Number of instances of this class are tracked in a static counter.
// Used to verify the get_singleton logic (below) is correct.
class CountedObject {
private:
    static atomic<int> instance_counter;

public:
    CountedObject() {
        int prev_counter = instance_counter.fetch_add(1);
        if (prev_counter != 0)
            // Somehow, 2 objects exist at the same time. Why?
            throw runtime_error("Constructed " + to_string(prev_counter + 1) +
                                " counted objects");
    }
    ~CountedObject() {
        instance_counter.fetch_sub(1);
    }
    static int count() {
        return instance_counter.load();
    }
};

atomic<int> CountedObject::instance_counter{0};

// Returns reference to a singleton that gets destroyed when all references
// are destroyed.
template <typename T>
std::shared_ptr<T> get_singleton() {
    static mutex mtx;
    static weak_ptr<T> weak;
    scoped_lock lk(mtx);
    shared_ptr<T> shared = weak.lock();
    if (!shared) {
        shared.reset(new T, [](T* ptr){ 
            scoped_lock lk(mtx);
            delete ptr;
        });
        weak = shared;
    }
    return shared;
}

// This test passes consistently.
TEST(GetSingletonTest, SingleThreaded) {
    ASSERT_EQ(CountedObject::count(), 0);

    auto ref1 = get_singleton<CountedObject>();
    auto ref2 = get_singleton<CountedObject>();
    ASSERT_EQ(CountedObject::count(), 1);

    ref1.reset();
    ASSERT_EQ(CountedObject::count(), 1);
    ref2.reset();
    ASSERT_EQ(CountedObject::count(), 0);
}

// This test does NOT pass consistently.
TEST(GetSingletonTest, MultiThreaded) {
    const int THREAD_COUNT = 2;
    const int ITERS = 1000;
    vector<thread> threads;
    for (int i = 0; i < THREAD_COUNT; ++i)
        threads.emplace_back([ITERS]{
            // Repeatedly obtain and release references to the singleton.
            // The invariant must hold that at most one instance ever exists
            // at a time.
            for (int j = 0; j < ITERS; ++j) {
                auto local_ref = get_singleton<CountedObject>();
                local_ref.reset();
            }
        });
    for (auto& t : threads)
        t.join();
}

On my system (ARM64 Linux, g++ 7.5.0), the multi-threaded test usually fails:

[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from GetSingletonTest
[ RUN      ] GetSingletonTest.SingleThreaded
[       OK ] GetSingletonTest.SingleThreaded (0 ms)
[ RUN      ] GetSingletonTest.MultiThreaded
terminate called after throwing an instance of 'std::runtime_error'
  what():  Constructed 2 counted objects
Aborted (core dumped)

I've omitted cout messages from the code for brevity, but separately I've added messages to debug what's happening:

[thread id,  message]
...
547921465808 Acquired lock
547921465808 Weak ptr expired, constructing
547921465808 Releasing lock
547929858512 Acquired lock
547929858512 Weak ptr expired, constructing
terminate called after throwing an instance of 'std::runtime_error'
  what():  Constructed 2 counted objects
Aborted (core dumped)

It appears that thread 1 determines the singleton has expired and re-constructs it. Then thread 2 wakes up and also determines the singleton has expired, even though thread 1 has just re-populated it - as if thread 2 was working with an "out-of-date" version of the weak pointer.

How can I make the assignment to weak from thread 1 immediately visible to thread 2? Can this be achieved with C++20's atomic<weak_ptr<T>>? My project is restricted to C++17, so unfortunately that is not an option for me.

I appreciate your help!


Solution

  • The assignment under lock will be visible to any other thread checking the weak_ptr under lock.

    The reason multiple instances exist is that the destructor order of operations doesn't guarantee otherwise. !weak.lock() isn't equivalent to "with certainty, there are no instances of this object". It is equivalent to the inverse of "with certainty, there is an instance of this object". There may be an instance, because the reference count is decremented before the deleter has a chance to act.

    Imagine this sequence:

    EDIT:

    To accomplish this, you need some indicator that there is no instance. That must be reset only after the instance is destroyed. And second, you need to decide what to do if you encounter this case where an instance exists but is being destroyed:

    template <typename T>
    std::shared_ptr<T> get_singleton() {
        static mutex mtx;
        static weak_ptr<T> weak;
        static bool instance;
    
        while (true) {
            scoped_lock lk(mtx);
    
            shared_ptr<T> shared = weak.lock();
            if (shared)
                return shared;
    
            if (instance)
                // allow deleters to make progress and try again
                continue;
    
            instance = true;
            shared.reset(new T, [](T* ptr){ 
                scoped_lock lk(mtx);
                delete ptr;
                instance = false;
            });
            weak = shared;
            return shared;
        }
    }
    

    To put it a different way, there are these states in the state machine:

    1. no T exists
    2. T is being constructed, but can't yet be accessed
    3. T exists and is accessible
    4. T is being destroyed, and can no longer be accessed

    The bool lets us know that we are in state #4. One approach is to loop until we are no longer in state #4. We drop the lock so that deleters can make progress. When we retake the lock, another thread may have swooped in and created the instance, so we need to start at the top and re-check everything.