c++ undefined-behavior

Why does the compiler optimize `g` to always return true, even though `i == INT_MAX` avoids undefined behavior?


In this example from CppCon:

#include <climits>  // INT_MAX

bool f(int i) { return i + 1 > i; }

bool g(int i) {
  if (i == INT_MAX) return false;
  else return f(i);
}

The compiler optimizes f to always return true because signed integer overflow is undefined behavior (UB). However, I don't understand why it would optimize g to always return true, even when i == INT_MAX.

For i == INT_MAX, no overflow occurs in g, so UB should never happen. Why does the compiler still change the behavior of g?

I tried this on Compiler Explorer with -O3 on x86-64 Clang 19.1.0, and this was the output:

f(int):
        mov     al, 1
        ret

g(int):
        cmp     edi, 2147483647
        setne   al
        ret

As you can see, it doesn't optimize g to always return true — it properly checks for i == INT_MAX. Was the information presented in the video incorrect?

Here's a quote from the video explaining the behavior:

In particular it means when you call f, integer overflow does not happen, so i != INT_MAX. It will then, under that assumption, eliminate what is basically that code, because i != INT_MAX. So the first line of g is that code — it will never happen.

Here’s the link to the video for reference.


Solution

  • The presenter got it wrong. The compiler could assume i != INT_MAX if f were called unconditionally. But g explicitly checks for the one input that would make f produce UB and, in that case, doesn't make the call. So g is well-defined for every input, and the compiler has to keep the check.

    I think what he means is that in this case:

    #include <climits>  // INT_MAX
    
    bool f(int i) { return i + 1 > i; }
    
    int global;
    
    bool g(int i) {
      if (i == INT_MAX) 
          global = 42;
      return f(i);
    }
    

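    a conforming compiler could delete the if and the assignment to global: f(i) is now called unconditionally, so i == INT_MAX would lead to UB and the compiler may assume it never happens. But compilers mostly avoid doing this in practice, because it produces all sorts of surprising behaviour that programmers don't want, such as: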

    // May be compiled into an infinite loop
    for (int i = 0; i < 4; ++i) {
        // With a 32-bit int, i * 1'000'000'000 overflows if i >= 3,
        // thus the compiler may assume i < 3
        // and replace the i < 4 check with true
        std::cout << i * 1'000'000'000 << std::endl;
    }
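
For contrast, here is a minimal sketch of the well-defined versions of both examples. f and g are copied from the question. scaled_values is a hypothetical helper, not something from the question or the video: it reuses the loop above but widens the multiplication to long long, which is at least 64 bits, so no overflow occurs and the compiler has nothing to assume away.

```cpp
#include <climits>
#include <vector>

// Same f as in the question: only well-defined when i != INT_MAX.
bool f(int i) { return i + 1 > i; }

// Same g as in the question: the guard keeps INT_MAX away from f,
// so no signed overflow can occur for any input.
bool g(int i) {
    if (i == INT_MAX) return false;
    return f(i);
}

// Hypothetical helper: the loop from the answer, with the product
// computed in long long so that i * 1'000'000'000 cannot overflow
// for i in [0, 4). The i < 4 check therefore has to stay.
std::vector<long long> scaled_values() {
    std::vector<long long> out;
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<long long>(i) * 1'000'000'000LL);
    }
    return out;
}
```

With these definitions, g(INT_MAX) returns false and g(0) returns true, which matches the cmp/setne code Clang emitted in the question, and scaled_values() stops after four elements.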