While I was studying C++, I found something weird...
I thought the code below would print a very large number (or at least something other than 1.1).
Instead, the result was 1.1.
Other compilers worked as expected.
But Clang with aggressive optimization seems to ignore the while loop.
So my question is: what is the problem with my code? Or is this behaviour intended by Clang?
I used the Apple Clang compiler (v14.0.3).
#include <chrono>
#include <iostream>
#include <thread>

static bool should_terminate = false;

void infinite_loop() {
    long double i = 1.1;
    while (!should_terminate)
        i *= i;
    std::cout << i;
}

int main() {
    std::thread(infinite_loop).detach();
    std::cout << "main thread";
    for (int i = 0; i < 5; i++) {
        std::this_thread::sleep_for(std::chrono::seconds(1));
        std::cout << ".";
    }
    should_terminate = true;
}
Assembly output from Compiler Explorer (clang v16.0.0, -O3).
This also seems to skip the while loop:
_Z13infinite_loopv:                     # @_Z13infinite_loopv
        sub     rsp, 24
        fld     qword ptr [rip + .LCPI0_0]
        fstp    tbyte ptr [rsp]
        mov     rdi, qword ptr [rip + _ZSt4cout@GOTPCREL]
        call    _ZNSo9_M_insertIeEERSoT_@PLT
        add     rsp, 24
        ret
Your code has undefined behaviour: should_terminate is not an atomic object, so writing to it in one thread and accessing it in another thread potentially concurrently (i.e. without any synchronization) is a data race, which is always undefined behaviour.
Practically speaking this UB rule permits the compiler to make exactly the optimization you see here.
The compiler can assume that should_terminate will never change in the loop, because it cannot possibly be written to from another thread since that would be a data race. So when reaching the loop it is either false and stays false, so that the loop never terminates, or it is true, in which case the loop body doesn't execute at all.
Then, because an infinite loop that doesn't perform any atomic/IO/volatile/synchronization operation would also have UB, the compiler can further deduce that should_terminate must be (always) true when the loop is reached. Consequently the loop body can never be executed and removing the loop is a permitted optimization.
So Clang is behaving correctly here and your expectations are wrong. should_terminate must be a std::atomic<bool> (or std::atomic_flag) so that writing to it unsynchronized with other accesses is not a data race.
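A minimal sketch of that fix, assuming the rest of the program stays roughly as in the question. The helper names squaring_loop and run_for are hypothetical, and the worker is joined rather than detached so that its result can be returned instead of printed from the worker thread.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Atomic flag: a store in one thread and a load in another no longer
// form a data race, so the compiler cannot assume the value is fixed.
static std::atomic<bool> should_terminate{false};

// Same loop as in the question, but it re-reads the atomic flag on
// every iteration and returns the value instead of printing it.
// (Hypothetical name.)
long double squaring_loop() {
    long double i = 1.1;
    while (!should_terminate.load())
        i *= i;
    return i;
}

// Hypothetical helper: run the worker, wait for `delay`, then ask it
// to stop and collect its result.
long double run_for(std::chrono::milliseconds delay) {
    should_terminate.store(false);
    long double result = 0.0L;
    std::thread worker([&result] { result = squaring_loop(); });
    std::this_thread::sleep_for(delay);
    should_terminate.store(true);   // observed by the worker's next load
    worker.join();                  // makes the write to `result` visible here
    return result;
}
```

For example, std::cout << run_for(std::chrono::seconds(5)) mirrors the five-second wait in the question. Because the value is squared on every iteration, it overflows to infinity almost immediately, so the printed result is typically inf rather than a large finite number.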