According to this, a 64 bit load/store is considered to be an atomic access on arm64. Given this, is the following program still considered to have a data race (and thus can exhibit UB) when compiled for arm64 (ignore ordering with respect to other memory accesses)
uint64_t x;
// Thread 1
void f()
{
uint64_t a = x;
}
// Thread 2
void g()
{
x = 1;
}
If instead I switch this to using
std::atomic<uint64_t> x{};
// Thread 1
void f()
{
uint64_t a = x.load(std::memory_order_relaxed);
}
// Thread 2
void g()
{
x.store(1, std::memory_order_relaxed);
}
Is the second program considered data race free?
On arm64, it looks like the compiler ends up generating the same instruction for a normal 64 bit load/store and a load/store of an atomic with memory_order_relaxed
, so what's the difference?
Whether or not an access is a data race in the sense of the C++ language standard is independent of the underlying hardware. The language has its own memory model and even if a straight-forward compilation to the target architecture would be free of problems, the compiler may still optimize based on the assumption that the program is free of data races in the sense of the C++ memory model.
Accessing a non-atomic in two threads without synchronization with one of them being a write is always a data race in the C++ model. So yes, the first program has a data race and therefore undefined behavior.
In the second program the object is an atomic, so there cannot be a data race.