C++ has the "common initial sequence" rule for unions:
In a standard-layout union with an active member (11.5) of struct type T1, it is permitted to read a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2; the behavior is as if the corresponding member of T1 were nominated.
How does this interact with nested structs? For instance, is it permissible to access u.s.s1.a1, u.s.s1.a2, u.s.s2.a1, u.s.s2.a2 if u.b is active? Or does the fact that S::s1 and B::b1 have different types imply that S and B do not have a common initial sequence?
struct A
{
    int a1, a2;
};
struct S
{
    A s1, s2;
};
struct B
{
    int b1, b2, b3, b4;
};
union U
{
    S s;
    B b;
};
/******************/
U u;
u.s.s1.a1 = 1;
u.b.b1 == 1; // UB?
/******************/
The common initial sequence exception for inactive union member access is pretty specific. It effectively requires exactly identical non-static data member definitions.
Specifically, [class.mem.general]/23 requires that, in declaration order, corresponding members have layout-compatible types.
That in turn is defined in [basic.types.general]/11 and can apply only if either both types are class types or neither is.
Since A is a class type and int is not, they are not layout-compatible, and therefore the common initial sequence of S and B is empty.
So the access has UB.
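As a hedged illustration of the rule, the sketch below contrasts the question's types with a pair of hypothetical structs, P and Q, whose leading members do have matching types. P, Q, V and the function name read_common_prefix are made up for this example and are not part of the question. The is_layout_compatible check is compiled only where the C++20 library trait is available.

```cpp
#include <type_traits>

struct A { int a1, a2; };
struct S { A s1, s2; };
struct B { int b1, b2, b3, b4; };

// Hypothetical types for illustration: both start with two ints,
// so their common initial sequence is {int, int}.
struct P { int p1, p2; float pf; };
struct Q { int q1, q2; char qc; };

union V { P p; Q q; };

// Every type involved is standard-layout, which the rule requires.
static_assert(std::is_standard_layout<S>::value, "S must be standard-layout");
static_assert(std::is_standard_layout<B>::value, "B must be standard-layout");
static_assert(std::is_standard_layout<V>::value, "V must be standard-layout");

#if defined(__cpp_lib_is_layout_compatible)
// A class type and a scalar type are never layout-compatible, so S and B
// are not layout-compatible either.
static_assert(!std::is_layout_compatible_v<A, int>);
static_assert(!std::is_layout_compatible_v<S, B>);
#endif

int read_common_prefix()
{
    V v;
    v.p = P{1, 2, 3.0f};     // p becomes the active member
    return v.q.q1 + v.q.q2;  // allowed: q1 and q2 lie in the common initial sequence
}
```

Reading v.q.qc here would not be covered, because char and float end the common initial sequence after the second member.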
It isn't even guaranteed that any of the (indirect) int members other than the first has the same offset within its containing class. A conforming implementation would be allowed to add padding after the first int in A, but not in B.
Similarly, an implementation is allowed to choose alignof(A) != alignof(B).
Of course, neither of these is practical, so implementations don't do that.
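To see what a particular implementation actually does, the member offsets can be compared directly. The following is a minimal sketch under assumptions: the helper names layouts_line_up and to_b are hypothetical, and the memcpy-based copy is a different technique from anything in the answer above. It is generally treated as well-defined for trivially copyable types and avoids reading an inactive union member altogether.

```cpp
#include <cstddef>
#include <cstring>

struct A { int a1, a2; };
struct S { A s1, s2; };
struct B { int b1, b2, b3, b4; };

// True if, on this implementation, the four ints nested in S sit at the
// same offsets as b1..b4 in B. The standard does not promise this.
constexpr bool layouts_line_up()
{
    return sizeof(S) == sizeof(B)
        && offsetof(S, s1) + offsetof(A, a1) == offsetof(B, b1)
        && offsetof(S, s1) + offsetof(A, a2) == offsetof(B, b2)
        && offsetof(S, s2) + offsetof(A, a1) == offsetof(B, b3)
        && offsetof(S, s2) + offsetof(A, a2) == offsetof(B, b4);
}

// Copy the bytes of an S into a B instead of punning through a union.
// The resulting values are only meaningful when layouts_line_up() is true.
B to_b(const S& s)
{
    static_assert(sizeof(S) == sizeof(B), "S and B differ in size on this platform");
    B b;
    std::memcpy(&b, &s, sizeof b);
    return b;
}
```

On mainstream ABIs layouts_line_up() is expected to be true, in line with the remark above that implementations do not add such padding in practice, but that is an observation about those platforms rather than a guarantee.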