c++ struct sizeof abi

Why does the size of a struct change depending on whether an initial value is used?


In the following code, D1 and D2 have the same data members. However, their sizes differ depending on whether the last member of the base class has a default member initializer.

    #include <cstdint>
    
    struct B1 { int x; short y = 0; };
    struct D1 : B1 { short z; };
    struct B2 { int x; short y; };
    struct D2 : B2 { short z; };
    static_assert(sizeof(D1) == sizeof(D2));

GCC 14.2 fails with the following (Clang 18.0.1 rejects the assertion as well):

    <source>:7:26: error: static assertion failed
        7 | static_assert(sizeof(D1) == sizeof(D2));
          |               ~~~~~~~~~~~^~~~~~~~~~~~~
    <source>:7:26: note: the comparison reduces to '(8 == 12)'

What is the explanation for this behavior?


Solution

  • This is an arbitrary ABI choice for compatibility with C.

    In the Itanium C++ ABI, as implemented by GCC and Clang here, B2 is considered POD for the purpose of layout while B1 is not. (This is reasonable because B2's definition is also valid C, while B1's is not.)

    The ABI specifies that the tail padding of a type is reused only if the type is not POD for the purpose of layout. This ensures that memcpy/memset, as typically used in C, are safe on the base class subobject and won't overwrite any derived class members. In addition, a C compiler is always allowed to overwrite padding when a member is modified, and it is not aware of tail padding reuse. So even without calls to memcpy and the like, using the base class subobject from C would otherwise be incompatible with C++.

    Both B1 and B2 have two bytes of tail padding. D1 reuses them to hold the z member, while D2 does not and leaves those two bytes as padding, placing z after them instead. Because the size of the whole class must be a multiple of its alignment, which is 4 because of the int member, D2 then also needs two additional bytes of new tail padding, which gives the 12 bytes seen in the error message.


    Or to be more precise:

    It seems to me that the Itanium ABI does not clearly specify whether a default member initializer makes the type non-POD for the purpose of layout. It uses the C++03 definition of POD, which naturally knows nothing about default member initializers, since they were only introduced in C++11. However, GCC and Clang at least seem to agree that a default member initializer should make the class non-POD for the purpose of layout, and in general it makes sense that later extensions to C++ should not need to worry much about C compatibility. (One could argue in either direction, but at least a definition with a default member initializer can't be used in C unmodified.)


    Also important:

    GCC has not always behaved this way in C++14 or later mode. Before GCC 12, it considered both types to be POD for the purpose of layout when compiling as C++14 or later. See the breaking ABI change in version 17 (-fabi-version=17) at https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html. Clang seems to have always behaved as both compilers do now.


    Also note that this really is just an implementation choice (although one relevant to ABI and C compatibility). According to the C++ standard, using functions like memcpy/memset/etc. on a base class subobject has always been undefined behavior, regardless of whether the types are POD/trivial in any capacity.
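
As a rough sketch of how to inspect the layout described above, the snippet below measures where z ends up in each derived class. The helper names offset_of_z, z_in_base_footprint, print_layout and print_layouts are made up for illustration and are not part of any library. The offset is measured at run time because offsetof is only conditionally supported for non-standard-layout classes such as D1 and D2. The numbers it reports depend on the compiler and ABI version.

```cpp
#include <cstddef>
#include <cstdio>

struct B1 { int x; short y = 0; };  // default member initializer: not valid C
struct D1 : B1 { short z; };
struct B2 { int x; short y; };      // also valid as a C struct
struct D2 : B2 { short z; };

// Hypothetical helper: byte offset of z inside a complete object of type D.
template <class D>
std::ptrdiff_t offset_of_z() {
    D d{};
    return reinterpret_cast<const char*>(&d.z) -
           reinterpret_cast<const char*>(&d);
}

// Hypothetical helper: true if z starts within the first sizeof(B) bytes,
// i.e. inside the footprint of the base class (assumed to sit at offset 0),
// which is where its tail padding lives.
template <class D, class B>
bool z_in_base_footprint() {
    return offset_of_z<D>() < static_cast<std::ptrdiff_t>(sizeof(B));
}

// Prints the sizes and the position of z for one base/derived pair.
template <class D, class B>
void print_layout(const char* name) {
    std::printf("%s: sizeof(base)=%zu sizeof(derived)=%zu offset(z)=%td "
                "z inside base footprint=%d\n",
                name, sizeof(B), sizeof(D), offset_of_z<D>(),
                static_cast<int>(z_in_base_footprint<D, B>()));
}

void print_layouts() {
    print_layout<D1, B1>("D1");
    print_layout<D2, B2>("D2");
}
```

Calling print_layouts() from main with the compilers from the question should show z inside B1's footprint for D1 (tail padding reused) and after B2's footprint for D2, consistent with the 8 vs. 12 in the error message. Older GCC versions in C++14 or later mode, or other ABIs, may report the same layout for both.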