I have been struggling with a bug for hours now. Basically, I do some simple bit operations on a uint64_t array in main.c (no function calls). It works properly with gcc (Ubuntu) and with MSVS2019 (Windows 10) in Debug, but not in Release. However, my target architecture is x64/Windows, so I need to get it to work properly with MSVS2019/Release. Besides that, I'm curious what the reason for the problem is. Neither compiler shows any errors or warnings.
Now, as soon as I add a totally unrelated command to the loop (the commented printf()), it works properly.
...
int q = 5;
uint64_t a[32] = { 0 };
// a[] is filled with data
for (int i = 0; i < 32; i++) {
    a[q] = (a[q] << 2) | 8;
    // printf("%i \n", i); // that's the line which makes it work
}
...
Initially I believed that I had messed up the stack somewhere before the for() loop, but I checked it multiple times ... all fine!
All the Google/SE posts I found attribute this kind of behavior to UB caused by one of the above reasons, but none of them applies to my code. Also, the fact that it works in MSVS2019/Debug and gcc suggests that the code is correct.
What am I missing?
--- UPDATE (24.08.2021 12:00) ---
I'm completely stuck, since adding printf() changes the result and MSVS/Debug works. So how can I inspect variables?!
@Lev M There are quite a few calculations before and after the for() loop shown. That's why I skipped most of the code and just showed the snippet where I could influence the code towards working correctly. I know what the final result should be (it's just a uint64_t), and it's wrong with the Release version of MSVS. I also checked without the for() loop. It's not optimized "away". If I leave it out completely, the result is different again.
@tstanisl It's just a matter of a uint64_t number. I know that input A should output B.
@Steve Summit That's why I posted (a bit desperate). I checked in all directions, isolated as much code as I could and yet ... no uninitialized variable or out-of-bounds array access. Driving me nuts.
@Craig Estey The code is unfortunately quite extensive. I wonder ... could the error also be in a part of the code which doesn't run?
@Eric Postpischil Agreed!
@Nate Eldredge I tested on valgrind (see below).
...
==13997== HEAP SUMMARY:
==13997== in use at exit: 0 bytes in 0 blocks
==13997== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==13997==
==13997== All heap blocks were freed -- no leaks are possible
==13997==
==13997== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
--- UPDATE (24.08.2021 18:00) ---
I found the reason for the problem (after countless rounds of trial and error), but no solution yet. Here is more of the code.
...
int q = 5;
uint64_t a[32] = { 0 };
// a[] is filled with data
for (int i = 0; i < 32; i++) {
    a[q] = (a[q] << 2) | 8;
    // printf("%i \n", i); // that's the line which makes it work
}
for (int i = 0; i < 32; i++) {
    a[q] = (a[q] << 3) | 3;
}
...
In fact, the MSVS/Release compiler did this:
...
int q = 5;
uint64_t a[32] = { 0 };
// a[] is filled with data
for (int i = 0; i < 32; i++) {
    a[q] = (a[q] << 2) | 8;
    a[q] = (a[q] << 3) | 3;
}
...
... which is not the same. Never seen such a thing!
How can I force the compiler to keep the two for() loops separate?
Summary:
MSVS/Release (default solution properties) optimization will change this code ...
// Code 1
...
int q = 5;
uint64_t a[32];
// a[] is filled with data
for (int i = 0; i < 32; i++) {
    a[q] = (a[q] << 2) | 8;
    // printf("%i \n", i); // that's the line which makes it work
}
for (int i = 0; i < 32; i++) {
    a[q] = (a[q] << 3) | 3;
}
...
... into the following one, which is not equivalent:
// Code 2
...
int q = 5;
uint64_t a[32];
// a[] is filled with data
for (int i = 0; i < 32; i++) {
    a[q] = (a[q] << 2) | 8;
    a[q] = (a[q] << 3) | 3;
}
...
The excerpt above is slightly simplified: the real code is not limited to a constant 32 iterations, the loop count is variable (% 8). Hence the loops can't simply be replaced by 64-bit constants, as a commenter suggested.
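To check independently of any compiler setting that Code 1 and Code 2 really compute different values, the two forms can be written as separate functions and compared. This is only a minimal sketch: the names separate_loops, merged_loop and check_loops_differ, as well as the parameter n (standing in for the variable loop count), are hypothetical and not part of the original program.

```c
#include <assert.h>
#include <stdint.h>

/* Code 1: the first loop finishes before the second one starts. */
static uint64_t separate_loops(uint64_t x, int n) {
    for (int i = 0; i < n; i++) {
        x = (x << 2) | 8;
    }
    for (int i = 0; i < n; i++) {
        x = (x << 3) | 3;
    }
    return x;
}

/* Code 2: both statements interleaved in a single loop. */
static uint64_t merged_loop(uint64_t x, int n) {
    for (int i = 0; i < n; i++) {
        x = (x << 2) | 8;
        x = (x << 3) | 3;
    }
    return x;
}

/* Hypothetical self-check: the forms agree for one iteration only. */
static void check_loops_differ(void) {
    assert(separate_loops(0, 1) == merged_loop(0, 1));
    assert(separate_loops(0, 32) != merged_loop(0, 32));
}
```

For a single iteration the two forms coincide (both yield 67 from a start value of 0), so the difference only shows up once the loops run more than once.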
Discoveries:
MSVS/Release - fails
MSVS/Debug - works
gcc/Release - works
gcc/Debug - works
MSVS/Release optimization merges the two for() loops (Code 1) into one for() loop (Code 2).
Fixes:
The commented printf() provides an artificial fix for this, as the compiler then sees the requirement to print an intermediate result.
An alternative fix would be to use the type qualifier volatile for a[].
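A minimal sketch of the volatile variant, assuming the same loop bodies as in Code 1. Every access to a volatile object has to be performed as written, so the compiler may not combine or reorder the stores to a[q]; the price is that these accesses are no longer optimized at all. The function name run_volatile and the parameters start and n are hypothetical.

```c
#include <stdint.h>

/* Runs both loops from Code 1 on a volatile array and returns a[q]. */
static uint64_t run_volatile(uint64_t start, int n) {
    volatile uint64_t a[32] = { 0 };  /* volatile: keep every load/store */
    const int q = 5;

    a[q] = start;                     /* stands in for "a[] is filled with data" */

    for (int i = 0; i < n; i++) {
        a[q] = (a[q] << 2) | 8;
    }
    for (int i = 0; i < n; i++) {
        a[q] = (a[q] << 3) | 3;
    }
    return a[q];
}
```

Whether this is acceptable depends on how hot the loops are; it is a workaround that sidesteps the optimizer rather than an explanation of its behavior.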
The root of the issue is that the MSVS optimization doesn't take into account that the index q remains the same in both loops, meaning that the first loop needs to finish before the second loop starts.