When compiling the following code with GCC 9.3.0 at -O2 and running it on Ubuntu 20.04 LTS (x86_64), the program produces unexpected output.
#include <algorithm>
#include <iostream>

int c[2];

void f(__int128 p) {
    c[p + 1] = 1;
    c[p + 1] = std::max(c[p + 1], c[p] + 1);
    return;
}

int main() {
    f(0);
    std::cout << c[1] << std::endl;
    return 0;
}
It is supposed to output 1, but it actually outputs 0. I reviewed the code and found nothing wrong, such as undefined behavior, so it appears the compiler is generating incorrect code.
I changed the optimization option and found that -O2, -O3, and -Ofast all incorrectly output 0, while -O and -O0 give the correct output of 1.
If I change __int128 to another type (__uint128_t, int, or any other integral type), it correctly outputs 1.
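For example, the only change needed is the parameter type of f:

void f(int p) {   // with int (or __uint128_t, long long, ...) instead of __int128, the output is the expected 1
    c[p + 1] = 1;
    c[p + 1] = std::max(c[p + 1], c[p] + 1);
}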
I added __attribute__((noinline)) to void f(__int128 p), which does not change the output. I then checked the assembly on Godbolt and found that if c[p] + 1 > 1, function f assigns c[p] + 1 to c[p + 1]; otherwise it does nothing, which is inconsistent with the code's semantics.
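In other words, the -O2 code behaves roughly like the following sketch (my paraphrase of the assembly, not actual compiler output; the name f_as_compiled is only for illustration):

// Rough C++ equivalent of what f compiles to at -O2: the unconditional
// store c[p + 1] = 1 has been dropped, so nothing is written when c[p] + 1 <= 1.
void f_as_compiled(__int128 p) {
    int candidate = c[p] + 1;
    if (candidate > 1)
        c[p + 1] = candidate;
}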
I tried other versions of gcc on Godbolt: every version of x86-64 gcc from 9.0 to 13.2 with -O2 gives the incorrect output, while older and newer gcc releases, as well as other compilers such as clang, give the correct one.
I checked the list of problem reports known to be fixed in the gcc 13.3 release, but I could not match any of them to this issue.
I wonder whether this is a compiler bug or a problem with my code. It is really confusing.
I also observed that if I change std::max(c[p + 1], c[p] + 1) to std::max(c[p + 1], 1), the program fails only up to gcc 11.2, which differs from the previous case. This suggests that gcc 11.3 may have partly fixed the issue, and I am still uncertain whether gcc 13.3 has fully resolved it.
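For reference, the function with that change is:

void f(__int128 p) {
    c[p + 1] = 1;
    c[p + 1] = std::max(c[p + 1], 1);   // constant 1 instead of c[p] + 1
}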
I ran a git bisect (with a slightly modified C testcase) and found that the bug was fixed in commit 7baefcb0:
The following fixes bogus truncation of a value-range for an int128 array index when computing the maximum extent for a variable array reference. Instead of possibly slowing things down by using widest_int the following makes sure the range bounds fit within the constraints offset_int were designed for.
tree-dfa.cc (get_ref_base_and_extent): Use index range bounds only if they fit within the address-range constraints of offset_int.
gcc.dg/torture/pr113396.c: New testcase.
That certainly sounds similar to your testcase. Of course, as a non-expert I can't say for sure if the bug was fixed properly, but at the very least it sounds like they were looking at the right issue.