Considering the following code:
std::atomic<int> counter;
/* otherStuff 1 */
counter.fetch_add(1, std::memory_order_relaxed);
/* otherStuff 2 */
Is there an instruction in x86-64 (say, on architectures less than 5 years old) that would allow otherStuff 1 and 2 to be reordered across the fetch_add, or is it always going to be serializing?
EDIT:
It looks like this is summarized by "is lock add a memory barrier on x86?" and it seems it is not, though I am not sure where to find a reference for that.
First, let's look at what the compiler is allowed to do when using std::memory_order_relaxed.
If there are no dependencies between otherStuff 1/2 and the atomic operation, the compiler can certainly reorder the statements. For example:
g = 3;
a.fetch_add(1, memory_order_relaxed);
g += 12;
clang++ generates the following assembly:
lock addl $0x1,0x2009f5(%rip) # 0x601040 <a>
movl $0xf,0x2009e7(%rip) # 0x60103c <g>
Here clang took the liberty to reorder g = 3 with the atomic fetch_add operation, which is a legitimate transformation. It also merged the two writes to g into a single store of 15 (0xf) placed after the increment.
When using std::memory_order_seq_cst, the compiler output becomes:
movl $0x3,0x2009f2(%rip) # 0x60103c <g>
lock addl $0x1,0x2009eb(%rip) # 0x601040 <a>
addl $0xc,0x2009e0(%rip) # 0x60103c <g>
No reordering of statements takes place because the compiler is not allowed to do that. A sequentially consistent read-modify-write (RMW) operation is both a release and an acquire operation, so no (visible) reordering of statements across it is allowed at either the compiler or the CPU level.
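To reproduce the two listings side by side, a minimal sketch along these lines could be used. The function names bump_relaxed and bump_seq_cst are made up for illustration, and the exact assembly will depend on the compiler version and flags, which the listings above do not state.

```cpp
#include <atomic>

std::atomic<int> a{0};  // the atomic counter from the example
int g = 0;              // a plain, non-atomic global

// Relaxed: the compiler is free to move or merge the plain stores to g
// around the atomic increment.
void bump_relaxed() {
    g = 3;
    a.fetch_add(1, std::memory_order_relaxed);
    g += 12;
}

// seq_cst: the RMW is both an acquire and a release operation, so the
// store before it may not sink below it and the update after it may not
// be hoisted above it.
void bump_seq_cst() {
    g = 3;
    a.fetch_add(1, std::memory_order_seq_cst);
    g += 12;
}
```

Compiling this with something like clang++ -O2 -S (the optimization level is an assumption) and comparing the two function bodies should show the same lock-prefixed instruction in both, with only the placement of the stores to g differing.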
Your question is whether, on x86-64, std::atomic::fetch_add, using relaxed ordering, is a serializing operation.
The answer is: yes, if you do not take into account compiler reordering.
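On the x86 architecture, an RMW operation always flushes the store buffer and is therefore effectively a serializing and sequentially consistent operation (serializing in the memory-ordering sense: the lock-prefixed instruction acts as a full memory barrier).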
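One way to observe the effect of that store-buffer flush is the classic store-buffer litmus test, sketched below. The helper name count_both_zero is hypothetical. The sketch requests seq_cst explicitly so that the result is guaranteed by the C++ memory model rather than by x86 behaviour alone; on x86 the exchange is typically compiled to xchg, which is implicitly locked, so no extra fence is needed.

```cpp
#include <atomic>
#include <thread>

// Runs the store-buffer litmus test `rounds` times and returns how often
// both threads observed 0 for the other thread's variable.
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.exchange(1, std::memory_order_seq_cst);  // RMW instead of a plain store
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread t2([&] {
            y.exchange(1, std::memory_order_seq_cst);  // RMW instead of a plain store
            r2 = x.load(std::memory_order_seq_cst);
        });

        // join() synchronizes with thread completion, so reading r1/r2 is safe.
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

With sequentially consistent operations the outcome r1 == 0 && r2 == 0 is forbidden, so count_both_zero should return 0. If the exchanges were replaced by plain relaxed stores, that outcome would be permitted, and on x86 it can actually occur because a store may still sit in the store buffer when the following load executes.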
You can say that, on an x86 CPU, each RMW operation: