c union strict-aliasing type-punning

Is it allowed to use unions to modify parts of an object?


According to GCC's documentation and the various answers I read on Stack Overflow, it is allowed to use unions for type punning in C, like:

union a_union {
  int i;
  double d;
};

int f() {
  union a_union t;
  t.d = 3.0;
  return t.i;
}

But is it allowed to use a union to modify part of an object, like this:

#include <stdint.h>
#include <assert.h>

union test {
    uint32_t val;
    uint16_t part[2];
};

int main(void)
{
    union test t;

    t.val = 0x12345678;
    t.part[0] = 0x4321;
    
    assert(t.val == 0x12344321);

    return 0;
}

Assuming our machine is little-endian, will the assert always succeed?

The C standard says:

When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.

But I do see similar usage in many places, such as the Linux kernel.

According to this answer, since sizeof(uint16_t [2]) == sizeof(uint32_t), this does not violate the rule above.

So, I modify my question:

#include <stdint.h>
#include <assert.h>

union test {
    uint32_t val;
    uint8_t bit_0;
    struct {
        uint8_t _bit_0;
        uint8_t bit_8;
    };
};

int main(void)
{
    union test t;

    t.val = 0x12345678;
    t.bit_0 = 0x21;
    t.bit_8 = 0x43;
    
    assert(t.val == 0x12344321);

    return 0;
}

Is it still valid in this situation?


Solution

  • Union type-punning is OK as long as the bytes you read do not form a trap representation for the type of the member you read. (The exact-width unsigned types uint8_t, uint16_t and uint32_t have no padding bits and therefore no trap representations; some signed and floating-point types can have them.)

    In C89, reading a union member other than the one last written was implementation-defined. Since C99 (clarified by a footnote added in TC3 and kept in C11), the relevant bytes of the object representation are reinterpreted in the type of the member you read. GCC and Clang also document union punning as supported.

    The rule you quoted (“bytes … that do not correspond to that member … take unspecified values”) doesn’t hurt your second example, because both members cover the same 4 bytes. When you store to t.part[0], you’re still writing within the union member part (the whole array), so there are no “other” bytes of the union that don’t correspond to that member.

    So your first two examples are OK.

    Assuming our machine is little-endian, will the assert always succeed?

    Yes

    Union punning is in common use. It’s a well-worn idiom in systems code (e.g., the Linux kernel, embedded firmware, and protocol parsers).

    Your last example:

    
    union test {
        uint32_t val;
        uint8_t bit_0;
        struct {
            uint8_t _bit_0;
            uint8_t bit_8;
        };
    };
    
    int main(void)
    {
        union test t;
    
        t.val = 0x12345678;
        t.bit_0 = 0x21;
        t.bit_8 = 0x43;
        
        assert(t.val == 0x12344321);
    
        return 0;
    }
    

    What will happen on the compilers you actually use (GCC/Clang, little-endian):

    Why this “works” in practice

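    Why it isn’t strictly portable

    Read literally, the rule you quoted applies here: t.bit_0 is a one-byte member, so after the store to it the other three bytes of val take unspecified values. Likewise, the store to t.bit_8 goes through the two-byte anonymous struct, which leaves the upper two bytes of val outside the stored member.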

    It’s also endianness-dependent (big-endian would not match your expectation).