c++ g++ clang++

clang++ compiles unreachable function, g++ doesn't


I saw this meme on Instagram about some C++ code that should not output anything but does. The code is:

#include <iostream>

int main() {
    while (1)
        ;
}

void unreachable() {
    std::cout << "Hello World!" << std::endl;
}

I compiled it with clang as shown in the meme and got the same result (Ubuntu clang version 14.0.0-1ubuntu1.1) but the same code compiled with gcc does what you expect: nothing (g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0).

I would like to know why clang behaves differently, and how the heck the unreachable function gets executed if I never call it from the main function.


Solution

  • This is a well-known instance of (unusually?) aggressive optimization by Clang. You can find many lengthy discussions about it e.g. at https://github.com/llvm/llvm-project/issues/60622.

    The standard has certain forward-progress requirements on programs that the compiler is allowed to assume hold true.

    In particular, the compiler is allowed to assume that any thread eventually does one of the following: terminates, calls a standard library I/O function, performs a volatile access, or performs a synchronization or atomic operation.

    Your loop while (1); will cause the main thread to never do any of these things. Therefore the program has undefined behavior if this loop is reached in execution.
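
    For contrast, here is a minimal sketch of a spin loop that stays within the forward-progress rules, because every iteration performs an atomic load. The function name spin_until_stopped and the 10 ms delay are made up for illustration and are not from the question.

    ```cpp
    #include <atomic>
    #include <chrono>
    #include <thread>

    // Hypothetical helper: spins until another thread sets the flag.
    // Each iteration performs an atomic load, which is one of the operations
    // the forward-progress rules accept, so the compiler may not assume the
    // loop away.
    bool spin_until_stopped() {
        std::atomic<bool> stop{false};

        // A second thread sets the flag after a short, arbitrary delay.
        std::thread setter([&stop] {
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
            stop.store(true);
        });

        while (!stop.load())
            ;  // empty body, but the condition does an atomic load every time

        setter.join();
        return stop.load();
    }
    ```

    Calling spin_until_stopped() from main should return true once the second thread has set the flag. On some toolchains you may need to pass -pthread when compiling.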

    Clang replaces the loop with an unreachable marker, since a valid program could never reach the UB loop. As a result, it emits no instructions at all for the body of main, not even a ret: it follows from the above that main can never be called in a valid program.
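
    To get a feel for what an unreachable marker means, you can write one by hand. The sketch below uses __builtin_unreachable(), a GCC/Clang builtin (not standard C++), and a made-up enum Color with a made-up function to_code. It only illustrates the concept; it is not what Clang generates for the code in the question.

    ```cpp
    // Hypothetical types and names, for illustration only.
    enum class Color { Red, Green };

    int to_code(Color c) {
        switch (c) {
            case Color::Red:
                return 1;
            case Color::Green:
                return 2;
        }
        // Promise to the compiler that control never gets here.
        // If it does anyway, the behavior is undefined, and the compiler
        // is free to emit no instructions at all for this path.
        __builtin_unreachable();
    }
    ```

    With valid enumerators this behaves normally: to_code(Color::Red) returns 1 and to_code(Color::Green) returns 2. Passing any other value would reach the marker and is undefined.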

    So in the generated machine code, the call to main falls straight through into unreachable, which happens to be placed right after it.

    This behavior is permitted by the standard, with the exception that it results in main and unreachable having the same function address, which in general could affect observable behavior in ways that it shouldn't. If Clang emitted a single trapping ud2 instruction in the function body when doing such optimizations for functions that always have UB when called, it would be fully conforming. See e.g. https://github.com/llvm/llvm-project/issues/60596.

    Also note that C has a similar rule, but with an exception when the controlling condition of the loop is a constant expression, which is the case here. So in C this program does not have undefined behavior.


    Update as of September 2024:

    [P2809R3] has been adopted recently. With it, certain forms of infinite loops, termed trivial infinite loops, are exempt from the forward-progress guarantees mentioned above. In particular, while(1); is one of the forms that count as trivial infinite loops. With this change, OP's code no longer has UB.

    Clang will have this implemented in release 19 and OP's code will work as expected.