c struct unions

Difference between a Structure and a Union


Is there a good example that shows the difference between a struct and a union? I know that a struct uses memory for all of its members, while a union uses only as much memory as its largest member needs. Is there any other OS-level difference?


Solution

  • With a union, you're only supposed to use one of the elements at a time, because they're all stored at the same spot. This makes it useful when you want to store something that could be one of several types. A struct, on the other hand, has a separate memory location for each of its elements, and they can all be used at once.

    To give a concrete example of their use, I was working on a Scheme interpreter a little while ago and I was essentially overlaying the Scheme data types onto the C data types. This involved storing in a struct an enum indicating the type of value and a union to store that value.

    union foo {
      int a;   // can't use both a and b at once
      char b;
    } foo;
    
    struct bar {
      int a;   // can use both a and b simultaneously
      char b;
    } bar;
    
    union foo x;
    x.a = 3; // OK
    x.b = 'c'; // NO! this affects the value of x.a!
    
    struct bar y;
    y.a = 3; // OK
    y.b = 'c'; // OK
    

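    edit: If you're wondering what setting x.b to 'c' changes the value of x.a to, technically speaking the C standard doesn't say: the bytes of x.a that x.b doesn't cover take unspecified values, and which byte x.b overlays depends on the platform. A char is always 1 byte, and on most modern machines an int is 4 bytes, so giving x.b the value 'c' also gives the first byte of x.a that same value: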

    union foo x;
    x.a = 3;
    x.b = 'c';
    printf("%i, %i\n", x.a, x.b);
    

    prints

    99, 99
    

    Why are the two values the same? Because the other 3 bytes of the int 3 are all zero, so once its first byte is overwritten with 99, x.a as a whole also reads as 99. If we put in a larger number for x.a, you'll see that this is not always the case:

    union foo x;
    x.a = 387439;
    x.b = 'c';
    printf("%i, %i\n", x.a, x.b);
    

    prints

    387427, 99
    

    To get a closer look at the actual memory values, let's set and print out the values in hex:

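    union foo x;
    x.a = 0xDEADBEEF;   // too big for a 32-bit int; wraps to a negative value on typical machines
    x.b = 0x22;
    printf("%x, %x\n", (unsigned)x.a, (unsigned)x.b);   // %x expects an unsigned argument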
    

    prints

    deadbe22, 22
    

    You can clearly see where the 0x22 overwrote the 0xEF.

    BUT

    In C, the order of bytes in an int is not defined by the language. This program overwrote the 0xEF with 0x22 on my Mac, but there are other platforms where it would overwrite the 0xDE instead because the order of the bytes that make up the int is reversed. Therefore, when writing a program, you should never rely on overwriting specific data in a union, because the behavior is not portable.

    For more reading on the ordering of bytes, check out endianness.