Emulating a packed structure in portable C


I have the following structure:

typedef struct Octree {
    uint64_t *data;
    uint8_t alignas(8) alloc;
    uint8_t dataalloc;
    uint16_t size, datasize, node0;
    // Node8 is a union type of size 16, omitted for brevity
    Node8 alignas(16) node[]; 
} Octree;

For the code that operates on this structure to work as intended, node0 must immediately precede the first node, so that ((uint16_t *)Octree.node)[-1] accesses Octree.node0. Each Node8 is essentially a union holding 8 uint16_t values. With GCC I could force the structure to be packed with #pragma pack(push) and #pragma pack(pop). However, this is non-portable. Another option is to:

This option is quite impractical. How else could I define this 'packed' data structure in a portable way? Are there any other ways?


Solution

  • The C language standard does not allow you to specify a struct's memory layout down to the last bit. Other languages do (Ada and Erlang come to mind), but C does not.

    So if you want actual portable standard C, you specify a C struct for your data and convert to and from the specific memory layout using pointers, probably composing from and decomposing into a lot of uint8_t values to avoid endianness issues. Writing such code is error-prone, requires duplicating memory, and, depending on your use case, can be relatively expensive in both memory and processing.
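    As a rough illustration of that byte-wise conversion, here is a minimal sketch. It assumes a hypothetical 8-byte external layout for the small header fields only (alloc, dataalloc, size, datasize, node0, stored little-endian); the names OctreeHeader, OCTREE_HEADER_BYTES, put_u16le, get_u16le, octree_header_encode and octree_header_decode are invented for this example and do not come from the question.

```c
#include <stdint.h>

/* Hypothetical in-memory form of the small header fields.
 * The compiler may lay this struct out however it likes. */
typedef struct OctreeHeader {
    uint8_t  alloc;
    uint8_t  dataalloc;
    uint16_t size;
    uint16_t datasize;
    uint16_t node0;
} OctreeHeader;

/* Size of the assumed external byte layout. */
enum { OCTREE_HEADER_BYTES = 8 };

/* Store a 16-bit value as two bytes, least significant byte first. */
static void put_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xffu);
    p[1] = (uint8_t)(v >> 8);
}

/* Load a 16-bit value from two bytes, least significant byte first. */
static uint16_t get_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}

/* Compose the external layout byte by byte; the result does not
 * depend on host endianness or on struct padding. */
void octree_header_encode(const OctreeHeader *h,
                          uint8_t out[OCTREE_HEADER_BYTES])
{
    out[0] = h->alloc;
    out[1] = h->dataalloc;
    put_u16le(out + 2, h->size);
    put_u16le(out + 4, h->datasize);
    put_u16le(out + 6, h->node0);   /* node0 occupies the last two bytes */
}

/* Decompose the external layout back into the struct. */
void octree_header_decode(OctreeHeader *h,
                          const uint8_t in[OCTREE_HEADER_BYTES])
{
    h->alloc     = in[0];
    h->dataalloc = in[1];
    h->size      = get_u16le(in + 2);
    h->datasize  = get_u16le(in + 4);
    h->node0     = get_u16le(in + 6);
}
```

    The cost mentioned above is visible here: every access to the external layout goes through a copy, and each field has to be listed twice, once per direction.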

    If you want direct access to a memory layout via a struct in C, you need to rely on compiler features which are not in the C language specification, and therefore are not "portable C".

    So the next best thing is to make your C code as portable as possible while at the same time preventing compilation of that code for incompatible platforms. You define the struct and provide platform/compiler specific code for each and every supported combination of platform and compiler, and the code using the struct can be the same on every platform/compiler.

    Now you need to make sure that it is impossible to accidentally compile for a platform/compiler where the memory layout is not exactly the one your code and your external interface require.

    Since C11, that is possible using static_assert, sizeof and offsetof.

    So something like the following should do the job if you can require C11 (I presume you can require C11 as you are using alignas which is not part of C99 but is part of C11). The "portable C" part here is you fixing the code for each platform/compiler where the compilation fails due to one of the static_assert declarations failing.

    #include <assert.h>
    #include <stdalign.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    
    typedef uint16_t Node8[8];
    
    typedef struct Octree {
        uint64_t *data;
        uint8_t alignas(8) alloc;
        uint8_t dataalloc;
        uint16_t size, datasize, node0;
        Node8 alignas(16) node[];
    } Octree;
    
    static_assert(0x10 == sizeof(Octree),              "Octree size error");
    static_assert(0x00 == offsetof(Octree, data),      "Octree data position error");
    static_assert(0x08 == offsetof(Octree, alloc),     "Octree alloc position error");
    static_assert(0x09 == offsetof(Octree, dataalloc), "Octree dataalloc position error");
    static_assert(0x0a == offsetof(Octree, size),      "Octree size position error");
    static_assert(0x0c == offsetof(Octree, datasize),  "Octree datasize position error");
    static_assert(0x0e == offsetof(Octree, node0),     "Octree node0 position error");
    static_assert(0x10 == offsetof(Octree, node),      "Octree node[] position error");
    

    The series of static_assert declarations could be written more concisely, with less redundant typing of error messages, by using a preprocessor macro that stringifies the struct name, member name, and perhaps the size/offset value.
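    One possible shape for such macros is sketched below. The macro names ASSERT_SIZE and ASSERT_OFFSET are made up for this example, and the expected values are the ones from the declarations above, so this only compiles on platforms where that layout actually holds (for instance, with 8-byte pointers). The final assertion is an extra one that states the node0 requirement from the question directly.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t Node8[8];

typedef struct Octree {
    uint64_t *data;
    uint8_t alignas(8) alloc;
    uint8_t dataalloc;
    uint16_t size, datasize, node0;
    Node8 alignas(16) node[];
} Octree;

/* Hypothetical helper: check sizeof(type) and build the message
 * from the stringified arguments. */
#define ASSERT_SIZE(type, expected)                        \
    static_assert(sizeof(type) == (expected),              \
                  #type " size is not " #expected)

/* Hypothetical helper: check offsetof(type, member). */
#define ASSERT_OFFSET(type, member, expected)              \
    static_assert(offsetof(type, member) == (expected),    \
                  #type "." #member " offset is not " #expected)

ASSERT_SIZE(Octree, 0x10);
ASSERT_OFFSET(Octree, data,      0x00);
ASSERT_OFFSET(Octree, alloc,     0x08);
ASSERT_OFFSET(Octree, dataalloc, 0x09);
ASSERT_OFFSET(Octree, size,      0x0a);
ASSERT_OFFSET(Octree, datasize,  0x0c);
ASSERT_OFFSET(Octree, node0,     0x0e);
ASSERT_OFFSET(Octree, node,      0x10);

/* The property the question relies on, stated relative to the
 * members rather than as absolute offsets. */
static_assert(offsetof(Octree, node0) + sizeof(uint16_t)
                  == offsetof(Octree, node),
              "Octree.node0 must immediately precede Octree.node[]");
```

    The adjacent string literals in each message are concatenated by the compiler, so a failing check reports something like "Octree.node0 offset is not 0x0e" without the text being typed by hand.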

    Now that we have nailed down the struct member sizes and offsets within the struct, two aspects still need checks.