c++ pthreads thread-sanitizer

ThreadSanitizer reports a data race despite mutex-protected access in pthread_cancel cleanup handler


I am encountering a puzzling issue with ThreadSanitizer while using pthread_cancel and a cleanup handler in a multithreaded C++ program. The sanitizer reports a data race on a global variable even though all accesses to the variable are protected by the same mutex. Below is a minimal reproducible example:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t mtx_test = PTHREAD_MUTEX_INITIALIZER;
int ga = 0;
void cleanup(void *arg)
{
    pthread_mutex_lock(&mtx_test);
    ga += 1;
    pthread_mutex_unlock(&mtx_test);
    printf("cleanup\n");
}

void* thr_fn(void * arg)
{
    pthread_cleanup_push(cleanup, arg);
    while(true)
    {
        pthread_mutex_lock(&mtx_test);
        ga += 1;
        pthread_mutex_unlock(&mtx_test);
        sleep(1);
    }
    pthread_cleanup_pop(1);
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, NULL, thr_fn, nullptr);

    sleep(3);
    pthread_cancel(tid);

    pthread_mutex_lock(&mtx_test);
    printf("ga: %d\n", ga);
    pthread_mutex_unlock(&mtx_test);

    pthread_join(tid, nullptr);
}

And here is the output reported by ThreadSanitizer:

==================
WARNING: ThreadSanitizer: data race (pid=29924)
  Write of size 4 at 0x5593bd535068 by thread T1:
    #0 cleanup(void*) /tmp/test/test.cpp:10 (a.out+0x139b) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)
    #1 __pthread_cleanup_class::~__pthread_cleanup_class() /usr/include/pthread.h:578 (a.out+0x16c9) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)
    #2 thr_fn(void*) /tmp/test/test.cpp:25 (a.out+0x1493) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)
    #3 thr_fn(void*) /tmp/test/test.cpp:23 (a.out+0x147e) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)

  Previous read of size 4 at 0x5593bd535068 by main thread (mutexes: write M0):
    #0 main /tmp/test/test.cpp:37 (a.out+0x153c) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)

  As if synchronized via sleep:
    #0 sleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:383 (libtsan.so.2+0x58691) (BuildId: 38097064631f7912bd33117a9c83d08b42e15571)
    #1 thr_fn(void*) /tmp/test/test.cpp:23 (a.out+0x147e) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)

  Location is global 'ga' of size 4 at 0x5593bd535068 (a.out+0x4068)

  Mutex M0 (0x5593bd535040) created at:
    #0 pthread_mutex_lock ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1341 (libtsan.so.2+0x59a13) (BuildId: 38097064631f7912bd33117a9c83d08b42e15571)
    #1 thr_fn(void*) /tmp/test/test.cpp:20 (a.out+0x1438) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)

  Thread T1 (tid=29926, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1022 (libtsan.so.2+0x5ac1a) (BuildId: 38097064631f7912bd33117a9c83d08b42e15571)
    #1 main /tmp/test/test.cpp:31 (a.out+0x14fc) (BuildId: ad9c32e6694e2996bfed96f0eb507bec0278f791)

SUMMARY: ThreadSanitizer: data race /tmp/test/test.cpp:10 in cleanup(void*)
==================

The global variable ga is only accessed while holding the mtx_test mutex. The worker loop, the cleanup handler, and main all call pthread_mutex_lock before touching ga, so the accesses are mutually exclusive. The warning nevertheless reports a data race on ga, which seems incorrect given that the same mutex guards every access.

Why does ThreadSanitizer report a data race in this case? Is this a known limitation of ThreadSanitizer when using pthread_cancel and cleanup handlers? If so, how can I address or suppress this false positive? Or is there an actual flaw in my code that I am overlooking?

Any help or insights would be greatly appreciated! Thank you!


Solution

As pointed out by someone else in the comments section, this appears to be a bug in ThreadSanitizer: when sleep acts as the cancellation point, ThreadSanitizer does not notice the thread synchronization performed during cleanup. The report therefore appears to be a false positive. As far as I can tell, your code is correct.

As a workaround, you can apply __attribute__((no_sanitize("thread"))) to your function cleanup, as described in the ThreadSanitizer documentation. This lets cleanup access any variable without triggering a ThreadSanitizer report. I have successfully tested this workaround on both GCC and Clang. Alternatively, you can use the suppression or blacklist mechanism.
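For illustration, here is a minimal sketch of the attribute workaround applied to the code from the question. The NO_TSAN macro and the run_workaround_demo helper are hypothetical names introduced only for this example, and the helper joins the thread before reading ga, which differs slightly from the original main. Treat it as one way to apply the attribute, not as the only one.

```cpp
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical convenience macro (not part of any library): expands to the
// attribute on GCC and Clang and to nothing on other compilers.
#if defined(__clang__) || defined(__GNUC__)
#define NO_TSAN __attribute__((no_sanitize("thread")))
#else
#define NO_TSAN
#endif

static pthread_mutex_t mtx_test = PTHREAD_MUTEX_INITIALIZER;
static int ga = 0;

// ThreadSanitizer instrumentation is disabled for this function only.
// The mutex is still taken, so the access remains correctly synchronized.
NO_TSAN void cleanup(void *)
{
    pthread_mutex_lock(&mtx_test);
    ga += 1;
    pthread_mutex_unlock(&mtx_test);
    printf("cleanup\n");
}

void *thr_fn(void *arg)
{
    pthread_cleanup_push(cleanup, arg);
    while (true)
    {
        pthread_mutex_lock(&mtx_test);
        ga += 1;
        pthread_mutex_unlock(&mtx_test);
        sleep(1); // cancellation point
    }
    pthread_cleanup_pop(1);
    return nullptr;
}

// Hypothetical helper: starts the worker, cancels it, waits for it to
// finish, and returns the final value of ga (or -1 if the thread could
// not be created).
int run_workaround_demo()
{
    pthread_t tid;
    if (pthread_create(&tid, nullptr, thr_fn, nullptr) != 0)
        return -1;

    sleep(1);
    pthread_cancel(tid);
    pthread_join(tid, nullptr);

    pthread_mutex_lock(&mtx_test);
    int result = ga;
    pthread_mutex_unlock(&mtx_test);
    return result;
}
```

Calling run_workaround_demo() from main and building with -fsanitize=thread should exercise the cancellation path. Note the trade-off: the attribute switches off race detection for everything inside cleanup, so a genuine race introduced there later would go unreported. Keeping the handler as small as possible limits that blind spot.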