[SOLVED] How to properly represent arrays of non-full-byte numbers in C++?

I wanted to implement the FAT12 specification in C++, where the FAT is an array of 12-bit numbers.

Because every type occupies a whole number of bytes, I tried using bit-fields in a struct to hold a pair of 12-bit numbers, which together would fill 3 bytes:

struct TwoEntries {
    uint16_t first : 12;
    uint16_t second : 12;
};

But this struct has a size of four bytes because of padding (explained in this question), so an array of these structs would not line up with the on-disk data.

So my question is:
Is there some way to properly declare an array of 12-bit numbers?

Solution

Technically there is a way, but it's not portable, because it relies on a GCC/Clang attribute:

#include <cstdint>

struct [[gnu::packed]] TwoEntries {
    std::uint16_t first : 12;
    std::uint16_t second : 12;
};

static_assert(sizeof(TwoEntries) == 3); // assertion passes

How bit-field members are allocated, how much padding sits between them, and in which order they are laid out are all implementation-defined. That makes bit-fields a poor tool for something like a file system, where the layout must be identical across all compilers.

Instead, consider creating a class which has a layout that you have total control over:

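struct TwoEntries {
    // Layout (little-endian, as in FAT12):
    //   data[0]             = low 8 bits of first
    //   data[1] low nibble  = high 4 bits of first
    //   data[1] high nibble = low 4 bits of second
    //   data[2]             = high 8 bits of second
    std::uint8_t data[3];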

    std::uint16_t get_first() const {
        return data[0] | ((data[1] & 0xf) << 8);
    }

    std::uint16_t get_second() const {
        return ((data[1] >> 4) & 0x0f) | (data[2] << 4);
    }

    void set_first(std::uint16_t x) {
        data[0] = x & 0xff;
        data[1] = (data[1] & 0xf0) | ((x >> 8) & 0x0f);
    }

    void set_second(std::uint16_t x) {
        data[1] = ((x & 0x0f) << 4) | (data[1] & 0xf);
        data[2] = (x >> 4) & 0xff;
    }
};

As you can see, this method is quite a bit more effort compared to using a bit-field, but we have total control over the memory layout, and our solution is portable across different compilers.
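
The same byte layout extends from a single pair to a whole table. As a minimal sketch, assuming the FAT is held as raw bytes in a `std::vector<std::uint8_t>`, the helpers below read and write the n-th 12-bit entry. The names `get_entry` and `set_entry` are hypothetical and not part of the answer above, and the caller is assumed to keep `n` within the table's bounds.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read the n-th 12-bit entry from a packed byte table.
// Entry n starts at byte offset floor(n * 1.5); even entries begin on a byte
// boundary, odd entries begin in the high nibble of that byte.
inline std::uint16_t get_entry(const std::vector<std::uint8_t>& fat, std::size_t n) {
    const std::size_t i = n + n / 2;
    const unsigned pair = fat[i] | (static_cast<unsigned>(fat[i + 1]) << 8);
    return static_cast<std::uint16_t>((n % 2 == 0) ? (pair & 0x0fffu) : (pair >> 4));
}

// Hypothetical helper: write the n-th 12-bit entry, leaving the neighbouring
// entry's nibble in the shared byte untouched. Bits above bit 11 of x are dropped.
inline void set_entry(std::vector<std::uint8_t>& fat, std::size_t n, std::uint16_t x) {
    const std::size_t i = n + n / 2;
    const unsigned v = x & 0x0fffu;
    if (n % 2 == 0) {
        fat[i] = static_cast<std::uint8_t>(v & 0xffu);
        fat[i + 1] = static_cast<std::uint8_t>((fat[i + 1] & 0xf0u) | (v >> 8));
    } else {
        fat[i] = static_cast<std::uint8_t>((fat[i] & 0x0fu) | ((v & 0x0fu) << 4));
        fat[i + 1] = static_cast<std::uint8_t>(v >> 4);
    }
}
```

Even and odd entries here correspond to `first` and `second` in `TwoEntries`, so both approaches produce the same bytes.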

If you run into this pattern a lot, it might make sense to create a class template such as:

template <std::size_t BitWidth, std::size_t BitOffset>
struct uint_bitref {
    void* to;
    uint_bitref(void* to) : to{to} {}
    /* ... */
};

// and then implement TwoEntries by returning this reference
// which we can use to read and write an integer at a certain bit offset

struct TwoEntries {
    using first_t = uint_bitref<12, 0>;
    using second_t = uint_bitref<12, 4>;

    std::uint8_t data[3];
    
    first_t first() {
        return data;
    }

    second_t second() {
        return data + 1;
    }
};
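
The body of `uint_bitref` is left open above. One possible way to fill it in is sketched below. It is an assumption rather than the answer's actual implementation, and it assumes LSB-first bit order (as in FAT12), a bit offset below 8, and a width of at most 32 bits. The constructor is made `explicit` here, which differs from the skeleton above, so the accessors construct the proxy explicitly.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch only: proxy for a BitWidth-bit unsigned integer that starts
// BitOffset bits into the byte pointed to by `to`.
template <std::size_t BitWidth, std::size_t BitOffset>
struct uint_bitref {
    static_assert(BitWidth > 0 && BitWidth <= 32 && BitOffset < 8,
                  "layout not supported by this sketch");
    void* to;
    explicit uint_bitref(void* to) : to{to} {}

    // Read: collect the value one bit at a time, lowest bit first.
    operator std::uint32_t() const {
        const std::uint8_t* p = static_cast<const std::uint8_t*>(to);
        std::uint32_t value = 0;
        for (std::size_t i = 0; i < BitWidth; ++i) {
            const std::size_t bit = BitOffset + i;
            value |= static_cast<std::uint32_t>((p[bit / 8] >> (bit % 8)) & 1u) << i;
        }
        return value;
    }

    // Write: set or clear each target bit; bits outside the field are untouched.
    uint_bitref& operator=(std::uint32_t x) {
        std::uint8_t* p = static_cast<std::uint8_t*>(to);
        for (std::size_t i = 0; i < BitWidth; ++i) {
            const std::size_t bit = BitOffset + i;
            const std::uint8_t mask = static_cast<std::uint8_t>(1u << (bit % 8));
            if ((x >> i) & 1u) {
                p[bit / 8] = static_cast<std::uint8_t>(p[bit / 8] | mask);
            } else {
                p[bit / 8] = static_cast<std::uint8_t>(p[bit / 8] & ~mask);
            }
        }
        return *this;
    }
};

struct TwoEntries {
    using first_t = uint_bitref<12, 0>;
    using second_t = uint_bitref<12, 4>;

    std::uint8_t data[3];

    first_t first() { return first_t{data}; }
    second_t second() { return second_t{data + 1}; }
};
```

With this in place, `e.first() = 0xABC;` writes through the proxy and `std::uint32_t v = e.second();` reads through it. The bit-by-bit loop favours clarity over speed. A production version would more likely use whole-byte masks and shifts.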