When compiling the following code with GCC 9.3.0 at -O2 and running it on Ubuntu 20.04 LTS (x86_64), the program produces unexpected output.
#include <algorithm>
#include <iostream>

int c[2];

void f(__int128 p) {
    c[p + 1] = 1;
    c[p + 1] = std::max(c[p + 1], c[p] + 1);
    return;
}

int main() {
    f(0);
    std::cout << c[1] << std::endl;
    return 0;
}
It is supposed to output 1, but it actually outputs 0. I reviewed the code and found nothing wrong, such as undefined behavior, so it appears the compiler is generating incorrect code.
I changed the optimization option and found that -O2, -O3, and -Ofast all incorrectly output 0, while -O and -O0 give the correct output of 1.
If I change __int128 to another type (__uint128_t, int, or any other integral type), it correctly outputs 1.
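For example, the only change needed is the parameter type of f:

void f(int p) {   // with int (or __uint128_t, long long, ...) instead of __int128, the output is the expected 1
    c[p + 1] = 1;
    c[p + 1] = std::max(c[p + 1], c[p] + 1);
}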
I added __attribute__((noinline)) to void f(__int128 p), which does not change the output. I then checked the assembly on Godbolt and found that if c[p] + 1 > 1, function f assigns c[p] + 1 to c[p + 1]; otherwise it does nothing, which is inconsistent with the code's semantics.
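In other words, the -O2 code behaves roughly like the following sketch (my paraphrase of the assembly, not actual compiler output; the name f_as_compiled is only for illustration):

// Rough C++ equivalent of what f compiles to at -O2: the unconditional
// store c[p + 1] = 1 has been dropped, so nothing is written when c[p] + 1 <= 1.
void f_as_compiled(__int128 p) {
    int candidate = c[p] + 1;
    if (candidate > 1)
        c[p + 1] = candidate;
}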
I tried other versions of gcc on Godbolt: every version of x86-64 gcc from 9.0 to 13.2 with -O2 gives the incorrect output, while older and newer gcc releases, as well as other compilers such as clang, give the correct one.
I checked the list of problem reports known to be fixed in the gcc 13.3 release, but I could not match any of them to this issue.
I wonder whether this is a compiler bug or a problem with my code. It is really confusing.
I also observed that if I change std::max(c[p + 1], c[p] + 1) to std::max(c[p + 1], 1), the program fails only up to gcc 11.2, which differs from the previous case. This suggests that gcc 11.3 may have partly fixed the issue, and I am still uncertain whether gcc 13.3 has fully resolved it.
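For reference, the function with that change is:

void f(__int128 p) {
    c[p + 1] = 1;
    c[p + 1] = std::max(c[p + 1], 1);   // constant 1 instead of c[p] + 1
}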
I ran a git bisect (with a slightly modified C testcase) and found that the bug was fixed in commit 7baefcb0:
The following fixes bogus truncation of a value-range for an int128 array index when computing the maximum extent for a variable array reference. Instead of possibly slowing things down by using widest_int the following makes sure the range bounds fit within the constraints offset_int were designed for.
tree-dfa.cc (get_ref_base_and_extent): Use index range bounds only if they fit within the address-range constraints of offset_int.
gcc.dg/torture/pr113396.c: New testcase.
That certainly sounds similar to your testcase. Of course, as a non-expert I can't say for sure if the bug was fixed properly, but at the very least it sounds like they were looking at the right issue.