In the following code, both D1 and D2 have the same data members. However, the object size differs depending on whether the last member of the base class has a default member initializer.
#include <cstdint>
struct B1 { int x; short y = 0; };
struct D1 : B1 { short z; };
struct B2 { int x; short y; };
struct D2 : B2 { short z; };
static_assert(sizeof(D1) == sizeof(D2));
GCC 14.2 (and Clang 18.0.1) fails with:
<source>:7:26: error: static assertion failed
7 | static_assert(sizeof(D1) == sizeof(D2));
| ~~~~~~~~~~~^~~~~~~~~~~~~
<source>:7:26: note: the comparison reduces to '(8 == 12)'
What is the explanation for this behavior?
This is an arbitrary ABI choice for compatibility with C.
In the Itanium C++ ABI, as implemented by GCC and Clang here, B2 is considered POD for the purpose of layout while B1 is not. (This is reasonable because B2's definition is also valid C, while B1's definition is not.)
The ABI specifies that the tail padding of a type is reused only if the type is not POD for the purpose of layout. This ensures that, for example, memcpy/memset as typically used in C is safe on the base class subobject and won't overwrite any derived class members. Also, in C the compiler is always allowed to overwrite padding when a member is modified, and a C compiler is not aware of tail padding reuse. So even without calls to memcpy and the like, using the base class subobject from C would otherwise not be compatible with C++.
Both B1 and B2 have two bytes of tail padding. D1 reuses them to fit the z member, while D2 does not and leaves the two padding bytes empty. Because the size of the whole class must be a multiple of its alignment, which is 4 because of the int member, D2 then also needs two additional bytes of new tail padding: z sits at offset 8, which gives 10 bytes, rounded up to 12.
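The layout described above can be checked with a small sketch. It reuses the types from the question; the helper names offset_of_z and base_bytes_cover_z are made up for this illustration, and the second helper assumes the base class subobject sits at offset 0, which holds for a single non-virtual base under the Itanium ABI. The offset is measured at run time because offsetof is only conditionally supported for non-standard-layout classes such as D1 and D2.

```cpp
#include <cstddef>

struct B1 { int x; short y = 0; };
struct D1 : B1 { short z; };
struct B2 { int x; short y; };
struct D2 : B2 { short z; };

// Hypothetical helper: byte offset of the member z inside an object of type D.
template <class D>
std::ptrdiff_t offset_of_z() {
    D d{};
    return reinterpret_cast<const char*>(&d.z) -
           reinterpret_cast<const char*>(&d);
}

// Hypothetical helper: would a write of sizeof(Base) bytes starting at the
// Base subobject reach the storage of z?
// Assumption: the Base subobject is located at offset 0 of D.
template <class D, class Base>
bool base_bytes_cover_z() {
    return offset_of_z<D>() < static_cast<std::ptrdiff_t>(sizeof(Base));
}
```

With the sizes reported in the question (GCC 14.2 or Clang 18 targeting the Itanium ABI), offset_of_z<D1>() should come out as 6, inside the 8 bytes of B1, and offset_of_z<D2>() as 8. So base_bytes_cover_z<D1, B1>() is expected to be true, while base_bytes_cover_z<D2, B2>() is false, which is exactly the property that keeps a C-style memset over the base safe for D2.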
Or to be more precise:
It seems to me that the Itanium ABI does not clearly specify whether a default member initializer makes a type non-POD for the purpose of layout. It uses the C++03 definition of POD, which knows nothing about default member initializers because they were only introduced in C++11. However, GCC and Clang at least agree that a default member initializer should make the class non-POD for the purpose of layout, and in general it makes sense that later C++ language extensions should not need to worry much about C compatibility. (One could argue either way, but at the very least a definition with a default member initializer can't be used in C unmodified.)
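For comparison, here is how the modern language-level traits classify the two base classes. This is only a rough analogue: since C++11 the standard splits POD into "trivial" and "standard-layout", and the ABI's "POD for the purpose of layout" is a separate notion that compilers are not required to derive from these traits. The variable template is_pod_like is a name invented for this sketch, and the code assumes C++17.

```cpp
#include <type_traits>

struct B1 { int x; short y = 0; };
struct B2 { int x; short y; };

// Hypothetical shorthand approximating the C++11 notion of POD:
// standard-layout, trivially copyable and trivially default constructible.
template <class T>
constexpr bool is_pod_like =
    std::is_standard_layout_v<T> &&
    std::is_trivially_copyable_v<T> &&
    std::is_trivially_default_constructible_v<T>;

// Both classes are standard-layout and trivially copyable ...
static_assert(std::is_standard_layout_v<B1> && std::is_standard_layout_v<B2>);
static_assert(std::is_trivially_copyable_v<B1> && std::is_trivially_copyable_v<B2>);

// ... but the default member initializer makes B1's default constructor
// non-trivial, so only B2 passes the combined check.
static_assert(!std::is_trivially_default_constructible_v<B1>);
static_assert(std::is_trivially_default_constructible_v<B2>);
static_assert(!is_pod_like<B1> && is_pod_like<B2>);
```

So the distinction GCC and Clang draw here lines up with what the language itself says about B1 no longer being trivial, even though the ABI text predates that vocabulary.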
Also important:
GCC did not always behave this way in C++14 or later mode. Before GCC 12, it considered both types to be POD for the purpose of layout when compiling for C++14 or later. See the ABI-breaking change in ABI version 17 (-fabi-version=17) at https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html. Clang seems to have always behaved the way both compilers do now.
Also note that this really is just an implementation choice (although one relevant to ABI and C compatibility). According to the C++ standard, using functions like memcpy/memset/etc. on a base class subobject has always been undefined behavior, regardless of whether the types are POD/trivial in any capacity.