rust enums abi memory-layout

How to pack a Rust enum into its minimal size?


I have a Rust enum with some data. I want to pack it into as few bytes as possible. I tried using repr like this:

#[repr(u8)]
enum MyEnum {
    OptionA(u32),
    OptionB(u32),
    Nothing,
}

fn main() {
    println!("{}", std::mem::size_of::<MyEnum>()); // prints 8 (should be 5)
}

In theory, this should only need 5 bytes (1 for the u8 discriminant and 4 for the u32 payload). But regardless of which repr I use, it takes up the full 8 bytes, as if it were aligned to 4 bytes.

The official Rust docs make it clear that repr(u*) does what you expect for fieldless enums, but the section on enums with fields is ambiguous to me:

If the enum has fields, the effect is similar to the effect of repr(C) in that there is a defined layout of the type. This makes it possible to pass the enum to C code, or access the type's raw representation and directly manipulate its tag and fields.

So the layout is defined, but does the argument to repr just not do anything? Or is this a bug? This seems insane to me. I understand that it is often more performant to align fields, but what is the point of letting you specify a repr if it does nothing? If I want memory-efficient packing, do I have to implement it myself and lose all of Rust's pattern matching and safety guarantees?


Solution

  • In general, this type has to be 8 bytes. It contains a u32, which is 4 bytes and must sit at a 4-byte-aligned address, so the enum itself has an alignment of 4. In an array, every element must also start at a multiple of 4 bytes, which requires the size to be a multiple of 4. With a size of 5, you could take a reference to a u32 that is not aligned, and references are not allowed to be unaligned.

    Unaligned access is generally slower, and on some architectures it kills the process with a SIGBUS. Some of those architectures can have the kernel fix up the access instead, at the enormous cost of a trap, a context switch into the kernel, two loads and some shifts, and a context switch back out. People on those architectures usually prefer the SIGBUS, because then at least the problem is obvious. Even RISC-V, one of the newest architectures, doesn't guarantee fast unaligned access (it may trap into the kernel).

    Note that the C compiler does the same thing:

    #include <stdio.h>
    #include <inttypes.h>
    
    struct foo {
        uint8_t tag;
        uint32_t value;
    };
    
    int main(void)
    {
        printf("%zu\n", sizeof(struct foo));
    }
    

    That prints 8. It is true that some C compilers offer packed representations, but they are nonstandard.

    There has been some discussion of packed enums in Rust, but they have not been standardized yet, and so are not available.
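
If you only need the 5-byte form for storage (for example, in a large array or a file), one possible workaround is to keep the question's enum for normal use and convert to a byte-only struct at the boundary. The following is a minimal sketch, not an official pattern: `PackedMyEnum`, its tag values 0/1/2, the little-endian payload encoding, and the `unpack` method are all assumptions made up for this illustration. Because every field is a `u8` or a `u8` array, the struct has alignment 1, so no padding is added and no unaligned references can arise.

```rust
use std::mem::{align_of, size_of};

// The enum from the question, with derives added for the demo.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum MyEnum {
    OptionA(u32),
    OptionB(u32),
    Nothing,
}

// Hypothetical storage form: a 1-byte tag followed by 4 payload bytes.
// All fields have alignment 1, so repr(C) lays this out in exactly 5 bytes.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct PackedMyEnum {
    tag: u8,
    payload: [u8; 4],
}

impl From<MyEnum> for PackedMyEnum {
    fn from(e: MyEnum) -> Self {
        // Tag values and little-endian encoding are arbitrary choices for this sketch.
        match e {
            MyEnum::OptionA(v) => PackedMyEnum { tag: 0, payload: v.to_le_bytes() },
            MyEnum::OptionB(v) => PackedMyEnum { tag: 1, payload: v.to_le_bytes() },
            MyEnum::Nothing => PackedMyEnum { tag: 2, payload: [0; 4] },
        }
    }
}

impl PackedMyEnum {
    // Converts back to the ordinary enum; returns None for an unknown tag.
    fn unpack(self) -> Option<MyEnum> {
        let v = u32::from_le_bytes(self.payload);
        match self.tag {
            0 => Some(MyEnum::OptionA(v)),
            1 => Some(MyEnum::OptionB(v)),
            2 => Some(MyEnum::Nothing),
            _ => None,
        }
    }
}

fn main() {
    // On typical targets, where u32 has alignment 4, the enum reports size 8, align 4.
    println!("MyEnum: size {}, align {}", size_of::<MyEnum>(), align_of::<MyEnum>());
    // The byte-only struct reports size 5, align 1.
    println!(
        "PackedMyEnum: size {}, align {}",
        size_of::<PackedMyEnum>(),
        align_of::<PackedMyEnum>()
    );

    // Round trip: pack for storage, unpack to get pattern matching back.
    let packed = PackedMyEnum::from(MyEnum::OptionB(7));
    assert_eq!(packed.unpack(), Some(MyEnum::OptionB(7)));
}
```

You keep pattern matching and safety on `MyEnum` itself and pay for a conversion only where values are packed or unpacked. The trade-off is that the mapping between tags and variants is maintained by hand.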