In this example from CppCon:
bool f(int i) { return i + 1 > i; }
bool g(int i) {
    if (i == INT_MAX) return false;
    else return f(i);
}
The compiler optimizes f to always return true, because signed integer overflow is undefined behavior (UB). However, I don't understand why it would also optimize g to always return true, even when i == INT_MAX. For i == INT_MAX, no overflow ever occurs in g, so UB should never happen. Why does the compiler still change the behavior of g?
I tried this on Compiler Explorer with -O3 on x86-64 Clang 19.1.0, and this was the output:
f(int):
        mov     al, 1
        ret
g(int):
        cmp     edi, 2147483647
        setne   al
        ret
As you can see, it doesn't optimize g to always return true; it properly checks for i == INT_MAX. Was the information presented in the video incorrect?
Here's a quote from the video explaining the behavior:
In particular it means when you call f, integer overflow does not happen, so i != INT_MAX. It will then, under that assumption, eliminate what is basically that code, because i != INT_MAX. So the first line of g is that code — it will never happen.
Here’s the link to the video for reference.
The presenter got it wrong. The compiler could assume i != INT_MAX only if f were called unconditionally. But g explicitly checks for the one input that would make f produce UB, and in that case never makes the call, so the check is perfectly valid and cannot be removed.
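To make that concrete, here is a sketch of what g looks like once f is inlined (my own illustration with a made-up name g_inlined, not actual compiler output):

```cpp
#include <climits>

// Hypothetical sketch of g after inlining f's body
bool g_inlined(int i) {
    if (i == INT_MAX) return false;  // the guard runs before f's body
    // On this path i != INT_MAX, so i + 1 cannot overflow;
    // the compiler may therefore fold the comparison to true.
    return i + 1 > i;
}
```

Folding the comparison to true leaves `return i != INT_MAX;`, which is exactly the cmp/setne sequence Clang emits in the Compiler Explorer output above.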
I think what he means is that in this case:
bool f(int i) { return i + 1 > i; }
int global;
bool g(int i) {
    if (i == INT_MAX)
        global = 42;
    return f(i);
}
a conforming compiler could delete the first part of the function, because f(i) is called unconditionally afterwards, so it may assume i != INT_MAX throughout. But compilers don't do this in practice, because it produces all sorts of surprising behaviour that programmers don't want:
// Infinite loop
for (int i = 0; i < 4; ++i) {
    // Overflows if i >= 3, thus the compiler may assume i < 3
    // and replace the i < 4 check with true
    std::cout << i * 1'000'000'000 << std::endl;
}