I've stumbled upon the following code:
#include <bitset>
#include <iostream>

int main() {
    int x = 8;
    void *w = &x;
    bool val = *reinterpret_cast<const unsigned char*>(&x);
    bool *z = static_cast<bool *>(w);
    std::cout << "z (" << z << ") is " << *z << ": " << std::bitset<8>(*z) << "\n";
    std::cout << "val is " << val << ": " << std::bitset<8>(val) << "\n";
}
With -O3, this produced output:
z (0x7ffcaef0dba4) is 8: 00001000
val is 1: 00000001
However, with -O0, this produced output:
z (0x7ffe8c6c914c) is 0: 00000000
val is 1: 00000001
I know that dereferencing z invokes undefined behavior, which is why we are seeing inconsistent results. However, it seems that dereferencing the result of the reinterpret_cast to initialize val does not invoke undefined behavior, and reliably produces {0,1} values.
Via (https://godbolt.org/z/f6s11Kr96), we see that gcc for x86 produces:
lea rax, [rbp-16]
movzx eax, BYTE PTR [rax]
test al, al
setne al
mov BYTE PTR [rbp-9], al
The effect of the test and setne instructions is to convert nonzero values to 1 (and keep 0 values at 0). Is there some rule that states that reinterpret_casting from void * to const unsigned char * should have this behavior?
Accessing (i.e. reading) the value of *z (not merely dereferencing z itself) causes undefined behavior because it is an aliasing violation: z points to an object of type int, but the access is through an lvalue of type bool.
Access through an lvalue of type unsigned char is specifically exempt from being an aliasing violation (see [basic.lval]/11.3).
However, technically, it is still not specified what the result of accessing the int object through an unsigned char lvalue should be. The intent is that it gives the first byte of the object representation of the int object, but the standard is currently defective in not specifying that behavior. The paper P1839 attempts to resolve this defect.
After reading this first byte from the object representation as an unsigned char value, you convert it implicitly to bool when initializing bool val from it. The conversion from unsigned char to bool is a conversion of values, not a reinterpretation of the object representation. It is specified that a zero value is converted to false and anything else to true (see [conv.bool]).
Whether you cast through void* explicitly or cast the int* directly to unsigned char* or bool* doesn't matter at all. reinterpret_cast between object pointer types is actually specified to be equivalent to static_cast<void*> followed by static_cast to the target pointer type. (In your code static_cast and reinterpret_cast are interchangeable.)