
How to properly represent arrays of non-full-byte numbers in C++?


I want to implement the FAT12 specification in C++, where the FAT is an array of 12-bit numbers.

Because the size of a type is always a whole number of bytes, I tried using bit-fields in a struct to hold a pair of 12-bit numbers, which should fill exactly 3 bytes:

struct TwoEntries {
    uint16_t first : 12;
    uint16_t second : 12;
};

But this struct has a size of four bytes because of the padding explained in this question, and with that padding an array of these structs no longer matches the packed layout of the data.

So my question is:
Is there some way to properly declare an array of 12-bit numbers?


Solution

  • Technically there is a way, but it's not portable:

    #include <cstdint>
    
    struct [[gnu::packed]] TwoEntries {
        std::uint16_t first : 12;
        std::uint16_t second : 12;
    };
    
    static_assert(sizeof(TwoEntries) == 3); // passes with GCC and Clang, which support this attribute
    

    How bit-fields are allocated within their storage units, how much padding is placed between them, and whether a bit-field may straddle a unit boundary are all implementation-defined. That makes bit-fields a poor tool for something like a file system, where the layout must be identical for every compiler.

    Instead, consider creating a class which has a layout that you have total control over:

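    #include <cstdint>
    
    // Layout: first  = data[0] plus the low nibble of data[1] (as bits 8..11),
    //         second = the high nibble of data[1] plus data[2] (as bits 4..11).
    struct TwoEntries {
        std::uint8_t data[3];
    
        std::uint16_t get_first() const {
            return data[0] | ((data[1] & 0x0f) << 8);
        }
    
        std::uint16_t get_second() const {
            return ((data[1] >> 4) & 0x0f) | (data[2] << 4);
        }
    
        void set_first(std::uint16_t x) {
            data[0] = x & 0xff;
            // keep the high nibble, which belongs to the second entry
            data[1] = (data[1] & 0xf0) | ((x >> 8) & 0x0f);
        }
    
        void set_second(std::uint16_t x) {
            // keep the low nibble, which belongs to the first entry
            data[1] = ((x & 0x0f) << 4) | (data[1] & 0x0f);
            data[2] = (x >> 4) & 0xff;
        }
    };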
    

    As you can see, this method is quite a bit more effort compared to using a bit-field, but we have total control over the memory layout, and our solution is portable across different compilers.
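
    The same bit manipulation extends from a single pair to a whole table, which is closer to the "array of 12-bit numbers" the question asks for. The following is only a minimal sketch under stated assumptions: `Fat12Table`, `get`, `set`, and `bytes` are hypothetical names that are not part of the answer above, the layout is the same one `TwoEntries` uses, and bounds are checked only with `assert`.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical helper (not from the original answer): a table of 12-bit
    // entries packed into raw bytes, two entries per three bytes.
    class Fat12Table {
    public:
        // n entries occupy ceil(n * 12 / 8) bytes, all zero-initialized.
        explicit Fat12Table(std::size_t n) : bytes_((n * 3 + 1) / 2, 0), size_(n) {}

        std::size_t size() const { return size_; }
        const std::vector<std::uint8_t>& bytes() const { return bytes_; }

        std::uint16_t get(std::size_t i) const {
            assert(i < size_);
            const std::size_t b = i + i / 2;  // byte offset: floor(i * 1.5)
            const unsigned lo = bytes_[b];
            const unsigned hi = bytes_[b + 1];
            // Even entries start on a byte boundary, odd ones on a nibble.
            const unsigned v = (i % 2 == 0) ? (lo | ((hi & 0x0fu) << 8))
                                            : ((lo >> 4) | (hi << 4));
            return static_cast<std::uint16_t>(v);
        }

        void set(std::size_t i, std::uint16_t x) {
            assert(i < size_);
            const std::size_t b = i + i / 2;
            const unsigned v = x & 0x0fffu;  // ignore bits above the low 12
            if (i % 2 == 0) {
                bytes_[b] = static_cast<std::uint8_t>(v & 0xffu);
                bytes_[b + 1] = static_cast<std::uint8_t>(
                    (bytes_[b + 1] & 0xf0u) | (v >> 8));
            } else {
                bytes_[b] = static_cast<std::uint8_t>(
                    (bytes_[b] & 0x0fu) | ((v & 0x0fu) << 4));
                bytes_[b + 1] = static_cast<std::uint8_t>(v >> 4);
            }
        }

    private:
        std::vector<std::uint8_t> bytes_;
        std::size_t size_;
    };
    ```

    With this sketch, storing 0xABC in entry 0 and 0x123 in entry 1 leaves the first three bytes as BC 3A 12, the same bytes `TwoEntries` would produce for that pair.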

    If you run into this pattern a lot, it might make sense to create a class template such as:

    #include <cstddef>  // std::size_t
    #include <cstdint>  // std::uint8_t
    
    template <std::size_t BitWidth, std::size_t BitOffset>
    struct uint_bitref {
        void* to;
        uint_bitref(void* to) : to{to} {}
        /* ... */
    };
    
    // and then implement TwoEntries by returning this reference
    // which we can use to read and write an integer at a certain bit offset
    
    struct TwoEntries {
        using first_t = uint_bitref<12, 0>;
        using second_t = uint_bitref<12, 4>;
    
        std::uint8_t data[3];
        
        first_t first() {
            return data;
        }
    
        second_t second() {
            return data + 1;
        }
    };
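
    The body of the reference class is left out above. One possible way to fill it in is sketched here, purely as an illustration: `uint_bitref_sketch` and `PackedPair` are hypothetical names, the sketch takes a `std::uint8_t*` rather than a `void*`, it assumes bit 0 of each byte is the least significant bit (as in the FAT12 layout), and it copies one bit at a time for clarity rather than speed.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>

    // Hypothetical sketch: a proxy that reads and writes a BitWidth-bit
    // unsigned integer starting BitOffset bits past the byte it points to.
    template <std::size_t BitWidth, std::size_t BitOffset>
    class uint_bitref_sketch {
        static_assert(BitWidth >= 1 && BitWidth <= 32, "value must fit in 32 bits");
        std::uint8_t* to_;

    public:
        explicit uint_bitref_sketch(std::uint8_t* to) : to_{to} {
            assert(to != nullptr);
        }

        // Read: gather the bits one by one, least significant first.
        operator std::uint32_t() const {
            std::uint32_t value = 0;
            for (std::size_t i = 0; i < BitWidth; ++i) {
                const std::size_t bit = BitOffset + i;
                const std::uint32_t b = (to_[bit / 8] >> (bit % 8)) & 1u;
                value |= b << i;
            }
            return value;
        }

        // Write: set or clear each target bit, leaving neighbouring bits alone.
        uint_bitref_sketch& operator=(std::uint32_t x) {
            for (std::size_t i = 0; i < BitWidth; ++i) {
                const std::size_t bit = BitOffset + i;
                const auto mask = static_cast<std::uint8_t>(1u << (bit % 8));
                if ((x >> i) & 1u) {
                    to_[bit / 8] |= mask;
                } else {
                    to_[bit / 8] &= static_cast<std::uint8_t>(~mask);
                }
            }
            return *this;
        }
    };

    // Hypothetical counterpart of TwoEntries built on the proxy.
    struct PackedPair {
        std::uint8_t data[3];

        uint_bitref_sketch<12, 0> first() {
            return uint_bitref_sketch<12, 0>{data};
        }
        uint_bitref_sketch<12, 4> second() {
            return uint_bitref_sketch<12, 4>{data + 1};
        }
    };
    ```

    Under these assumptions, `p.first() = 0xABC; p.second() = 0x123;` on a zero-initialized `PackedPair p{}` writes the bytes BC 3A 12, and converting either proxy to `std::uint32_t` reads the stored value back.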