Why can't you use offsetof on non-POD structures in C++?

I was researching how to get the memory offset of a member to a class in C++ and came across this on wikipedia:

In C++ code, you can not use offsetof to access members of structures or classes that are not Plain Old Data Structures.

I tried it out and it seems to work fine.

class Foo
{
private:
    int z;
    int func() {cout << "this is just filler" << endl; return 0;}

public: 
    int x;
    int y;
    Foo* f;

    bool returnTrue() { return false; }
};

int main()
{
    cout << offsetof(Foo, x)  << " " << offsetof(Foo, y) << " " << offsetof(Foo, f);
    return 0;
}

I got a few warnings, but it compiled and when run it gave reasonable output:

Laptop:test alex$ ./test
4 8 12

I think I'm either misunderstanding what a POD data structure is or I'm missing some other piece of the puzzle. I don't see what the problem is.

Solution

Short answer: offsetof is a feature that is only in the C++ standard for legacy C compatibility. Therefore it is basically restricted to the stuff than can be done in C. C++ supports only what it must for C compatibility.

As offsetof is basically a hack (implemented as macro) that relies on the simple memory-model supporting C, it would take a lot of freedom away from C++ compiler implementors how to organize class instance layout.

The effect is that offsetof will often work (depending on source code and compiler used) in C++ even where not backed by the standard - except where it doesn't. So you should be very careful with offsetof usage in C++, especially ~~since I do not know a single compiler that will generate a warning for non-POD use...~~ Modern GCC and Clang will emit a warning if offsetof is used outside the standard (-Winvalid-offsetof).

Edit: As you asked for example, the following might clarify the problem:

#include <iostream>
using namespace std;

struct A { int a; };
struct B : public virtual A   { int b; };
struct C : public virtual A   { int c; };
struct D : public B, public C { int d; };

#define offset_d(i,f)    (long(&(i)->f) - long(i))
#define offset_s(t,f)    offset_d((t*)1000, f)

#define dyn(inst,field) {\
    cout << "Dynamic offset of " #field " in " #inst ": "; \
    cout << offset_d(&i##inst, field) << endl; }

#define stat(type,field) {\
    cout << "Static offset of " #field " in " #type ": "; \
    cout.flush(); \
    cout << offset_s(type, field) << endl; }

int main() {
    A iA; B iB; C iC; D iD;
    dyn(A, a); dyn(B, a); dyn(C, a); dyn(D, a);
    stat(A, a); stat(B, a); stat(C, a); stat(D, a);
    return 0;
}

This will crash when trying to locate the field a inside type B statically, while it works when an instance is available. This is because of the virtual inheritance, where the location of the base class is stored into a lookup table.

While this is a contrived example, an implementation could use a lookup table also to find the public, protected and private sections of a class instance. Or make the lookup completely dynamic (use a hash table for fields), etc.

The standard just leaves all possibilities open by restricting offsetof to POD (IOW: no way to use a hash table for POD structs... :)

Just another note: I had to reimplement offsetof (here: offset_s) for this example as GCC actually errors out when I call offsetof for a field of a virtual base class.