Can I access the first struct element without knowing what other elements are in it?


I suspect this is UB, according to ANSI, but on my system (x64 Linux), even with -Wall -Wextra -Werror -O3, both gcc and clang produce the expected result here:

#include <stdio.h>

typedef struct {
        int tag;
} base;

typedef struct {
        base b;
        int x;
} derived1;

typedef struct {
        base b;
        int x, y;
} derived2;

int sum(void* p) {
        base* b = (base*)p;

        if(b->tag) {
                derived2* d = (derived2*)b;
                return d->x + d->y;
        }
        else {
                derived1* d = (derived1*)b;
                return d->x;
        }
}

int main(void) {
        derived1 d1;
        d1.b.tag = 0;
        d1.x = 2;

        derived2 d2;
        d2.b.tag = 1;
        d2.x = 3;
        d2.y = 4;

        printf("%d %d\n", sum(&d1), sum(&d2));
}

Is this UB, according to ANSI?

Is it UB for gcc and clang also?

What if I use -fno-strict-aliasing or some other option?


Solution

  • The behavior is defined because C 2018 6.7.2.1 15 says “… A pointer to a structure object, suitably converted, points to its initial member…” sum is passed a pointer to a derived1 or a derived2, albeit converted to void *, and converts it to base *. Since base is the type of the initial member of both derived1 and derived2, the result is a proper pointer to that initial member. (To be pedantic, the C standard does not define “suitably converted,” but accepting this sequence of conversions as suitably converted is not controversial.)

    Then b->tag is an ordinary reference to the tag member of a base structure.

    The conversions of b to derived1 * or derived2 * are also defined since paragraph 15 continues “… and vice versa…” You could also convert to derived1 * or derived2 * directly from the void * p instead of from b.