I know this isn't a new issue, but I got confused after reading about C++11 memory fences.
Suppose I have one reader thread and one writer thread. Can I use an ordinary int?
int x = 0; // global

writer          reader
x = 1;          printf("%d\n", x);
Is this behavior undefined?
May I get an undefined value in the reader thread?
Or is it like using a std::atomic_uint_fast32_t or std::atomic<int>, so that the value will get to the reader thread eventually?
std::atomic<int> x{0}; // global

writer                                      reader
x.store(1, std::memory_order_relaxed);      printf("%d\n", x.load(std::memory_order_relaxed));
Does the answer depend on the platform I'm using (x86, for example), where loading or storing an ordinary int is one CPU instruction?
If the two behave similarly, should I expect the same performance from both types?
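In short, never share a plain int between threads in a multi-threaded environment. If one thread writes it while another reads it without synchronisation, that is a data race, which is undefined behaviour in C++11.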
The problem is not only your CPU but also your compiler's optimiser. gcc can (and will) optimise code like while(i == 1) {} into if(i==1) { while(1) {} }. Once it has checked the variable, it does not have to reload the value again. That's separate from all the other possible issues, such as seeing half-written values (which actually won't usually occur on x86 ints).
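As a minimal sketch of the alternative, here is the same kind of spin loop written against a std::atomic<int>. The names flag, wait_for_change and run_handoff are made up for illustration and are not from the question. Because every iteration performs an atomic load, the compiler may not hoist the check out of the loop the way it can with a plain int.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared variable; plays the role of `i` in the loop above.
std::atomic<int> flag{1};

// Spins until another thread changes flag away from 1, then returns the value seen.
// Each iteration does a real atomic load, so the check cannot be hoisted out.
int wait_for_change() {
    int seen;
    while ((seen = flag.load(std::memory_order_relaxed)) == 1) {
        std::this_thread::yield();  // give up the time slice while spinning
    }
    return seen;
}

// Hypothetical driver: start a reader, publish a new value, return what the reader saw.
int run_handoff() {
    flag.store(1, std::memory_order_relaxed);
    int result = 0;
    std::thread reader([&result] { result = wait_for_change(); });
    flag.store(2, std::memory_order_relaxed);  // the writer side
    reader.join();  // join synchronises, so reading result afterwards is safe
    return result;
}
```

Calling run_handoff() from main should return 2 once the reader observes the store. Relaxed ordering is enough here only because nothing else is being published through flag; if the reader also needed to see other data written before the store, release/acquire ordering would be the usual choice.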
Measuring the effect of atomic is very hard: in many cases CPUs can highly optimise the accesses, in others they are much slower. You really have to benchmark in practice.
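If you want a starting point for such a benchmark, something along these lines could work. It is only a rough sketch: atomic_counter and time_atomic_increments are hypothetical names, it measures a single uncontended thread, and the numbers will vary with compiler flags, CPU and contention, so treat it as a template for your own workload rather than a result.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical counter used only for timing.
std::atomic<int> atomic_counter{0};

// Performs `iterations` relaxed atomic increments and returns the elapsed
// wall-clock time in nanoseconds, measured with a monotonic clock.
std::int64_t time_atomic_increments(int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        atomic_counter.fetch_add(1, std::memory_order_relaxed);
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

Run it with a large iteration count, repeat it several times, and compare against the same loop in your real access pattern (several threads, mixed loads and stores) before drawing any conclusion.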