c++ segmentation-fault shared-ptr boost-timer

Segmentation fault on class destruction with boost::timer as a member of the class with periodic invocation


I'm working on a simple class that, upon creation, schedules a periodic timer to invoke one of its methods. The method is virtual, so derived classes can override it with whatever periodic work they need.

In my test of this class, however, I randomly get a segmentation fault and can't figure out why. Here's the code, along with examples of good and bad output:

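#include <iostream>

#include <boost/thread.hpp>            // boost::thread, boost::this_thread
#include <boost/thread/mutex.hpp>
#include <boost/thread/lock_guard.hpp>
#include <boost/asio/io_service.hpp>
#include <boost/asio/steady_timer.hpp>
#include <boost/chrono.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/function.hpp>
#include <boost/atomic.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
#include <boost/bind.hpp>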

//******************************************************************************
class PeriodicImpl;
class Periodic {
public:
    Periodic(boost::asio::io_service& io, unsigned int periodMs);
    ~Periodic();

    virtual unsigned int periodicInvocation() = 0;

private:
    boost::shared_ptr<PeriodicImpl> pimpl_;
};

//******************************************************************************
class PeriodicImpl : public boost::enable_shared_from_this<PeriodicImpl> 
{
public:
    PeriodicImpl(boost::asio::io_service& io, unsigned int periodMs,
        boost::function<unsigned int(void)> workFunc);
    ~PeriodicImpl();

    void setupTimer(unsigned int intervalMs);

    boost::atomic<bool> isRunning_;
    unsigned int periodMs_;
    boost::asio::io_service& io_;
    boost::function<unsigned int(void)> workFunc_;
    boost::asio::steady_timer timer_;
};

//******************************************************************************
Periodic::Periodic(boost::asio::io_service& io, unsigned int periodMs):
pimpl_(boost::make_shared<PeriodicImpl>(io, periodMs, boost::bind(&Periodic::periodicInvocation, this)))
{
    std::cout << "periodic ctor " << pimpl_.use_count() << std::endl;
    pimpl_->setupTimer(periodMs);
}

Periodic::~Periodic()
{
    std::cout << "periodic dtor " << pimpl_.use_count() << std::endl;
    pimpl_->isRunning_ = false;
    pimpl_->timer_.cancel();
    std::cout << "periodic dtor end " << pimpl_.use_count() << std::endl;
}

//******************************************************************************
PeriodicImpl::PeriodicImpl(boost::asio::io_service& io, unsigned int periodMs,
    boost::function<unsigned int(void)> workFunc):
// Initializers listed in member declaration order, which is the order they actually run in.
isRunning_(true), periodMs_(periodMs),
io_(io), workFunc_(workFunc), timer_(io_)
{
}

PeriodicImpl::~PeriodicImpl()
{
    std::cout << "periodic impl dtor" << std::endl;
}

void
PeriodicImpl::setupTimer(unsigned int intervalMs)
{
    std::cout << "schedule new " << intervalMs << std::endl;
    boost::shared_ptr<PeriodicImpl> self(shared_from_this());

    timer_.expires_from_now(boost::chrono::milliseconds(intervalMs));
    timer_.async_wait([self, this](const boost::system::error_code& e){
        std::cout << "hello invoke" << std::endl;
        if (!e)
        {
            if (isRunning_)
            {
                std::cout << "invoking" << std::endl;
                unsigned int nextIntervalMs = workFunc_();
                if (nextIntervalMs)
                    setupTimer(nextIntervalMs);
            }
            else
                std::cout << "invoke not running" << std::endl;
        }
        else
            std::cout << "invoke cancel" << std::endl;
    });

    std::cout << "scheduled " << self.use_count() << std::endl;
}

//******************************************************************************
class PeriodicTest : public Periodic
{
public:
    PeriodicTest(boost::asio::io_service& io, unsigned int periodMs):
        Periodic(io, periodMs), periodMs_(periodMs), workCounter_(0){}
    ~PeriodicTest(){
        std::cout << "periodic test dtor" << std::endl;
    }

    unsigned int periodicInvocation() {
        std::cout << "invocation " << workCounter_ << std::endl;
        workCounter_++;
        return periodMs_;
    }

    unsigned int periodMs_;
    unsigned int workCounter_;
};

//******************************************************************************
int main()
{
    boost::asio::io_service io;
    boost::shared_ptr<boost::asio::io_service::work> work(new boost::asio::io_service::work(io));
    boost::thread t([&io](){
        io.run();
    });
    unsigned int workCounter = 0;

    {
        PeriodicTest p(io, 50);
        boost::this_thread::sleep_for(boost::chrono::milliseconds(550));
        workCounter = p.workCounter_;
    }
    work.reset();
    //EXPECT_EQ(10, workCounter);
}

Good output:

hello invoke
invoking
invocation 9
schedule new 50
scheduled 5
periodic test dtor
periodic dtor 2
periodic dtor end 2
hello invoke
invoke cancel
periodic impl dtor

Bad output:

hello invoke
invoking
invocation 9
schedule new 50
scheduled 5
periodic test dtor
periodic dtor 2
periodic dtor end 2
periodic impl dtor
Segmentation fault: 11

Apparently, the segmentation fault happens because PeriodicImpl is destroyed, and its timer timer_ along with it, while the timer is still scheduled. I can't understand why the PeriodicImpl destructor is called in this case: a shared_ptr to PeriodicImpl is copied into the lambda passed as the timer's handler in setupTimer(), and that copy should keep PeriodicImpl alive and prevent the destructor from running.
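
To make the assumption I'm relying on explicit, here is a minimal sketch of the lifetime guarantee I expect from capturing self in the handler. It uses only the standard library (std::shared_ptr and std::function standing in for the Boost types), and Impl, makeHandler() and handlerKeepsImplAlive() are hypothetical names for illustration, not part of my real code:

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for PeriodicImpl: records its own destruction in a flag.
struct Impl : std::enable_shared_from_this<Impl> {
    explicit Impl(bool& destroyed) : destroyed_(destroyed) {}
    ~Impl() { destroyed_ = true; }

    // Builds a handler that owns a shared_ptr to this object,
    // the same way the lambda in setupTimer() captures self.
    std::function<void()> makeHandler() {
        std::shared_ptr<Impl> self(shared_from_this());
        return [self]() { (void)self; };
    }

    bool& destroyed_;
};

// Returns true if the object stays alive while the handler exists
// and is destroyed once the handler itself is destroyed.
bool handlerKeepsImplAlive() {
    bool destroyed = false;
    std::function<void()> handler;
    {
        std::shared_ptr<Impl> owner = std::make_shared<Impl>(destroyed);
        handler = owner->makeHandler();
    }   // owner is gone; the handler now holds the last reference
    const bool aliveWhilePending = !destroyed;
    handler = nullptr;   // dropping the handler releases the last reference
    return aliveWhilePending && destroyed;
}
```

If this holds for the timer handler as well, PeriodicImpl should not be destroyed before the pending handler has run or been discarded.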

Any ideas?


Solution

  • The problem turned out to be not in the code from the question at all, but in the code that tested it.

    I enabled core dumps by running ulimit -c unlimited and then used lldb to read the resulting core file:

    $ lldb bin/tests/test-segment-controller -c /cores/core.75876
    (lldb) bt all
    * thread #1: tid = 0x0000, 0x00007fff8eb800f9 libsystem_malloc.dylib`szone_malloc_should_clear + 2642, stop reason = signal SIGSTOP
      * frame #0: 0x00007fff8eb800f9 libsystem_malloc.dylib`szone_malloc_should_clear + 2642
        frame #1: 0x00007fff8eb7f667 libsystem_malloc.dylib`malloc_zone_malloc + 71
        frame #2: 0x00007fff8eb7e187 libsystem_malloc.dylib`malloc + 42
        frame #3: 0x00007fff9569923e libc++abi.dylib`operator new(unsigned long) + 30
        frame #4: 0x000000010da4b516 test-periodic`testing::Message::Message(this=0x00007fff521e8450) + 38 at gtest.cc:946
        frame #5: 0x000000010da4a645 test-periodic`testing::Message::Message(this=0x00007fff521e8450) + 21 at gtest.cc:946
        frame #6: 0x000000010da6c027 test-periodic`std::string testing::internal::StreamableToString<long long>(streamable=0x00007fff521e84b0) + 39 at gtest-message.h:244
        frame #7: 0x000000010da558e8 test-periodic`testing::internal::PrettyUnitTestResultPrinter::OnTestEnd(this=0x00007fe733421570, test_info=0x00007fe7334211c0) + 216 at gtest.cc:3141
        frame #8: 0x000000010da56a28 test-periodic`testing::internal::TestEventRepeater::OnTestEnd(this=0x00007fe733421520, parameter=0x00007fe7334211c0) + 136 at gtest.cc:3321
        frame #9: 0x000000010da53957 test-periodic`testing::TestInfo::Run(this=0x00007fe7334211c0) + 343 at gtest.cc:2667
        frame #10: 0x000000010da540c7 test-periodic`testing::TestCase::Run(this=0x00007fe733421660) + 231 at gtest.cc:2774
        frame #11: 0x000000010da5b5d6 test-periodic`testing::internal::UnitTestImpl::RunAllTests(this=0x00007fe733421310) + 726 at gtest.cc:4649
        frame #12: 0x000000010da83263 test-periodic`bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(object=0x00007fe733421310, method=0x000000010da5b300, location="auxiliary test code (environments or event listeners)")(), char const*) + 131 at gtest.cc:2402
        frame #13: 0x000000010da6cde1 test-periodic`bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(object=0x00007fe733421310, method=0x000000010da5b300, location="auxiliary test code (environments or event listeners)")(), char const*) + 113 at gtest.cc:2438
        frame #14: 0x000000010da5b2a2 test-periodic`testing::UnitTest::Run(this=0x000000010dab18e8) + 210 at gtest.cc:4257
        frame #15: 0x000000010da19541 test-periodic`RUN_ALL_TESTS() + 17 at gtest.h:2233
        frame #16: 0x000000010da1818b test-periodic`main(argc=1, argv=0x00007fff521e88b8) + 43 at test-periodic.cc:57
        frame #17: 0x00007fff9557b5c9 libdyld.dylib`start + 1
        frame #18: 0x00007fff9557b5c9 libdyld.dylib`start + 1

      thread #2: tid = 0x0001, 0x00007fff8ab404cd libsystem_pthread.dylib`_pthread_mutex_lock + 23, stop reason = signal SIGSTOP
        frame #0: 0x00007fff8ab404cd libsystem_pthread.dylib`_pthread_mutex_lock + 23
        frame #1: 0x000000010da1c8d5 test-periodic`boost::asio::detail::posix_mutex::lock(this=0x0000000000000030) + 21 at posix_mutex.hpp:52
        frame #2: 0x000000010da1c883 test-periodic`boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>::scoped_lock(this=0x000000010e4fac38, m=0x0000000000000030) + 51 at scoped_lock.hpp:46
        frame #3: 0x000000010da1c79d test-periodic`boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>::scoped_lock(this=0x000000010e4fac38, m=0x0000000000000030) + 29 at scoped_lock.hpp:45
        frame #4: 0x000000010da227a7 test-periodic`boost::asio::detail::kqueue_reactor::run(this=0x0000000000000000, block=true, ops=0x000000010e4fbda8) + 103 at kqueue_reactor.ipp:355
        frame #5: 0x000000010da2223c test-periodic`boost::asio::detail::task_io_service::do_run_one(this=0x00007fe733421900, lock=0x000000010e4fbd60, this_thread=0x000000010e4fbd98, ec=0x000000010e4fbe58) + 348 at task_io_service.ipp:368
        frame #6: 0x000000010da21e9f test-periodic`boost::asio::detail::task_io_service::run(this=0x00007fe733421900, ec=0x000000010e4fbe58) + 303 at task_io_service.ipp:153
        frame #7: 0x000000010da21d51 test-periodic`boost::asio::io_service::run(this=0x00007fff521e8338) + 49 at io_service.ipp:59
        frame #8: 0x000000010da184b8 test-periodic`TestPeriodic_TestDestructionDifferentThread_Test::TestBody(this=0x00007fe733421e28)::$_0::operator()() const + 24 at test-periodic.cc:41
        frame #9: 0x000000010da1846c test-periodic`boost::detail::thread_data<TestPeriodic_TestDestructionDifferentThread_Test::TestBody()::$_0>::run(this=0x00007fe733421c10) + 28 at thread.hpp:117
        frame #10: 0x000000010da8849c test-periodic`boost::(anonymous namespace)::thread_proxy(param=<unavailable>) + 124 at thread.cpp:164
        frame #11: 0x00007fff8ab4305a libsystem_pthread.dylib`_pthread_body + 131
        frame #12: 0x00007fff8ab42fd7 libsystem_pthread.dylib`_pthread_start + 176
        frame #13: 0x00007fff8ab403ed libsystem_pthread.dylib`thread_start + 13
    

    Thread 2 crashes while trying to lock a mutex that has already been destroyed. I'm not using any mutexes myself, so it must be one internal to io_service, and that can happen if the io_service is still in use after its destruction. Looking closely at my main() function, I noticed that the thread t I created is never joined. As a result, the io object is sometimes destroyed at the end of main() while thread t is still running io.run() on it.

    The problem was fixed by adding a t.join() call at the end of the main() function:

    int main()
    {
        boost::asio::io_service io;
        boost::shared_ptr<boost::asio::io_service::work> work(new boost::asio::io_service::work(io));
        boost::thread t([&io](){
            io.run();
        });
        unsigned int workCounter = 0;
    
        {
            PeriodicTest p(io, 50);
            boost::this_thread::sleep_for(boost::chrono::milliseconds(550));
            workCounter = p.workCounter_;
        }
        work.reset();
        t.join();
        //EXPECT_EQ(10, workCounter);
    }
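
As a possible way to make a forgotten join() harder to repeat, the join can be tied to scope exit. The following is only a sketch under assumptions: it uses std::thread in place of boost::thread, and FakeService, JoinGuard and runAndJoin() are hypothetical names standing in for io_service, a small RAII helper, and the test body. The point it illustrates is the ordering: the service is declared first, so it is destroyed last, after the guard has joined the thread.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for io_service: run() blocks until stop() is called.
class FakeService {
public:
    FakeService() : stopped_(false), finished_(false) {}
    void run() {
        while (!stopped_)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        finished_ = true;
    }
    void stop() { stopped_ = true; }
    bool finished() const { return finished_; }
private:
    std::atomic<bool> stopped_;
    std::atomic<bool> finished_;
};

// Hypothetical RAII helper: joins the referenced thread when it goes out of scope.
class JoinGuard {
public:
    explicit JoinGuard(std::thread& t) : t_(t) {}
    ~JoinGuard() { if (t_.joinable()) t_.join(); }
    JoinGuard(const JoinGuard&) = delete;
    JoinGuard& operator=(const JoinGuard&) = delete;
private:
    std::thread& t_;
};

// Returns true if run() had returned before the service could be destroyed.
bool runAndJoin() {
    FakeService service;              // declared first, so destroyed last
    {
        std::thread t([&service]() { service.run(); });
        JoinGuard guard(t);           // destroyed before t, joining it
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        service.stop();               // plays the role of work.reset()
    }   // guard joins here, so the worker has left run() by this point
    return service.finished();
}
```

Calling runAndJoin() from a test should return true, since the worker thread can no longer be inside run() once the inner scope has ended.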