It looks like, in the following example, the compiler assumes that the pointer to double passed to bar() may alias the integer member a:
struct A {
    int a;
    double b;
};
void bar(double*);
int foo() {
    A a;
    a.a = 9;
    bar(&a.b);
    return a.a;
}
clang generates:
stp x29, x30, [sp, #16]
add x29, sp, #16
mov w8, #9
str w8, [sp]
mov x8, sp # saving the variable
add x0, x8, #8
bl bar(double*)
ldr w0, [sp] # the reload
ldp x29, x30, [sp, #16]
add sp, sp, #32
ret
(godbolt)
This doesn't happen, obviously, if the integer variable is defined outside of the class (godbolt). So to me it seems like both gcc and clang assume that bar(double*) can clobber A::a through A::b, which feels weird since double* cannot alias int*. Is the intuition correct and if so, why?
The compiler considers that bar may do something like
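#include <cstddef>  // offsetof
#include <new>      // std::launder

void bar(double* d) {
    // Step back from a.b to the start of the enclosing A, then write to a.a.
    *std::launder(
        reinterpret_cast<int*>(
            reinterpret_cast<unsigned char*>(d) - offsetof(A, b))) = 0;
}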
which effectively changes the value of a.a.
Now, this isn't actually defined by the standard. What reinterpret_cast<unsigned char*>(d) should mean in the first place is currently underspecified. But even assuming that it yields a pointer into the object representation of *d, subtracting from it should probably be UB, because that is effectively the same as subtracting from a pointer to the initial element of an array.
The intention of the standard as currently written seems to be that the compiler may assume that a.a is unreachable from d; std::launder has preconditions to that effect. However, there are other language features that do not seem to integrate properly with the reachability requirements of std::launder (see previous questions of mine).
In practice, however, this is a common pattern, used e.g. by the container_of macro. So even if the compiler were allowed to assume that the call to bar can't change a.a, I would still expect compilers not to make that assumption in practice, since it would break these traditional techniques (at least for standard-layout classes, I would say).
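For illustration, here is a minimal sketch of that container_of-style pattern applied to the struct from the question. The helper name container_of_b is hypothetical, and whether the pointer arithmetic is strictly defined in C++ is exactly the open point discussed above; the sketch only shows what compilers have to tolerate in practice.

```cpp
#include <cstddef>  // offsetof

struct A {
    int a;
    double b;
};

// Hypothetical helper modeled on the traditional container_of macro:
// recover a pointer to the enclosing A from a pointer to its member b.
// Assumption: the compiler accepts this arithmetic, as mainstream
// compilers do, even though the standard's wording is unclear on it.
inline A* container_of_b(double* d) {
    auto* bytes = reinterpret_cast<unsigned char*>(d);
    return reinterpret_cast<A*>(bytes - offsetof(A, b));
}

// One possible definition of bar that reaches a.a through &a.b.
void bar(double* d) {
    container_of_b(d)->a = 0;
}

int foo() {
    A a;
    a.a = 9;
    bar(&a.b);
    return a.a;  // reloaded after the call, since bar may have changed it
}
```

If the compiler had kept the constant 9 instead of reloading a.a, a bar written like this would silently stop working, which is why I would not expect that optimization to be enabled.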
As far as I know (I might be wrong), this container_of approach is also well-defined in C. So even if C++ permitted the optimization, the need to stay compatible with mixed C code would probably restrict its use, again at least for sufficiently POD-like types, for which compatibility with C is intended.
In practice, that means that once a pointer to any member or base subobject of a class object escapes, the compiler has to assume that the whole class object may be modified.
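To make the contrast with the question concrete, here is a small sketch of the two situations side by side. The function names member_escapes and separate_locals are made up for this example, and bar is given a trivial stand-in body only so that the snippet links; in the scenario discussed here it would be opaque to the compiler.

```cpp
struct A {
    int a;
    double b;
};

// Stand-in definition; assume that in real code bar lives in another
// translation unit and the compiler cannot see what it does.
void bar(double* d) { *d = 1.0; }

int member_escapes() {
    A a;
    a.a = 9;
    bar(&a.b);   // a pointer into `a` escapes, so a.a is treated as possibly modified
    return a.a;  // typically compiled as a reload from the stack
}

int separate_locals() {
    int a = 9;
    double b = 0.0;
    bar(&b);     // only `b` escapes; `a` is a distinct complete object
    return a;    // the compiler can keep the constant 9
}
```

With this well-behaved bar both functions return 9; the difference only shows up in the generated code, where the first one has to reload the value after the call.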