c++cpu-cachememory-modelstdatomicinstruction-reordering

Do we have the guarantee that any atomic write will store the new value of the atomic variable in the main memory immediately?


So, I was reading a lot about instruction and memory reordering and how we can prevent it, but i still have no answer to one qustion (probably because I'm not attentive enough). My question is: Do we have the guarantee that any atomic write will store the new value of the atomic variable in the main memory immediately? Let's take a look at a small example:

std::atomic<bool> x;
std::atomic<bool> y;
std::atomic<int> count;
void WritingValues()
{
   x.store(true, std::memory_order_relaxed);
   y.store(true, std::memory_order_relaxed);
}
void ReadValues()
{
   while( !y.load(std::memory_order_relaxed) );
   if( x.load(std::memory_order_relaxed) )
       ++count;
}
int main()
{
   x = false;
   y = false;
   count = 0;
   std::thread tA(WritingValues);
   std::thread tB(ReadValues);
   tA.join();
   tB.join();
   assert( count.load() != 0 );
}

So, here our assert can definitely fire, as we use std::memory_order_relaxed and do not prevent any instruction reordering (or memory reordering at compile time, i suppose that is the same thing). But if we place some compiler barrier in WritingValues to prevent instruction reordering, will everything be OK? I mean, does x.store(true, std::memory_order_relaxed) guarantees, that the write of that particular atomic variable will be directly into the memory, without any latency? Or does x.load(std::memory_order_relaxed) guarantess, that the value would be readed from the memory, not the cache with invalid value? In other words, this store guarantees only atomicity of the operation and have the same memory behaviour as usual non-atomic variable, or it also has influence on memory behaviour?


Solution

  •  I mean, does x.store(true, std::memory_order_relaxed) guarantees, that the  
     of that particular atomic variable will be directly into the memory,  
     without any latency?  
    

    No it doesn't and in fact given bool and memory order relaxed there is no 'invalid' value if you read it only once, both true and false are ok.
    Since relaxed memory order explicitly stays that no ordering is performed. Basically in your case it only means that after flipping from false to true, at some point it will become true for all other processes, but doesn't state 'whent' it will happen. So the only thing you can be sure here is that it won't become false again after becoming true. But there is no constraints on how long it will be false in another thread.
    Also it guarantees that you won't see any partially written variable in another thread, but that's hardly the case for the bools.
    You need to use aquire and release here. And even that won't give any guarantees about actual memory itself, only about program behavior, cache synchronization may do the trick even without bouncing data back and froth to the memory.