c++vtablevptrvirtual-table

virtual table and _vptr storage scheme


Can someone explains how this virtual table for the different class is stored in memory? When we call a function using pointer how do they make a call to function using address location? Can we get these virtual table memory allocation size using a class pointer? I want to see how many memory blocks is used by a virtual table for a class. How can I see it?

class Base
{
public:
    FunctionPointer *__vptr;
    virtual void function1() {};
    virtual void function2() {};
};

class D1: public Base
{
public:
    virtual void function1() {};
};

class D2: public Base
{
public:
    virtual void function2() {};
};
int main()
{
    D1 d1;
    Base *dPtr = &d1;
    dPtr->function1();
}

Thanks! in advance


Solution

  • The first point to keep in mind is a disclaimer: none of this is actually guaranteed by the standard. The standard says what the code needs to look like and how it should work, but doesn't actually specify exactly how the compiler needs to make that happen.

    That said, essentially all C++ compilers work quite similarly in this respect.

    So, let's start with non-virtual functions. They come in two classes: static and non-static.

    The simpler of the two are static member functions. A static member function is almost like a global function that's a friend of the class, except that it also needs the class`s name as a prefix to the function name.

    Non-static member functions are a little more complex. They're still normal functions that are called directly--but they're passed a hidden pointer to the instance of the object on which they were called. Inside the function, you can use the keyword this to refer to that instance data. So, when you call something like a.func(b);, the code that's generated is pretty similar to code you'd get for func(a, b);

    Now let's consider virtual functions. Here's where we get into vtables and vtable pointers. We have enough indirection going on that it's probably best to draw some diagrams to see how it's all laid out. Here's pretty much the simplest case: one instance of one class with two virtual functions:

    enter image description here

    So, the object contains its data and a pointer to the vtable. The vtable contains a pointer to each virtual function defined by that class. It may not be immediately apparent, however, why we need so much indirection. To understand that, let's look at the next (ever so slightly) more complex case: two instances of that class:

    enter image description here

    Note how each instance of the class has its own data, but they both share the same vtable and the same code--and if we had more instances, they'd still all share the one vtable among all the instances of the same class.

    Now, let's consider derivation/inheritance. As an example, let's rename our existing class to "Base", and add a derived class. Since I'm feeling imaginative, I'll name it "Derived". As above, the base class defines two virtual functions. The derived class overrides one (but not the other) of those:

    enter image description here

    Of course, we can combine the two, having multiple instances of each of the base and/or derived class:

    enter image description here

    Now let's delve into that in a little more detail. The interesting thing about derivation is that we can pass a pointer/reference to an object of the derived class to a function written to receive a pointer/reference to the base class, and it still works--but if you invoke a virtual function, you get the version for the actual class, not the base class. So, how does that work? How can we treat an instance of the derived class as if it were an instance of the base class, and still have it work? To do it, each derived object has a "base class subobject". For example, lets consider code like this:

    struct simple_base { 
        int a;
    };
    
    struct simple_derived : public simple_base {
        int b;
    };
    

    In this case, when you create an instance of simple_derived, you get an object containing two ints: a and b. The a (base class part) is at the beginning of the object in memory, and the b (derived class part) follows that. So, if you pass the address of the object to a function expecting an instance of the base class, it uses on the part(s) that exist in the base class, which the compiler places at the same offsets in the object as they'd be in an object of the base class, so the function can manipulate them without even knowing that it's dealing with an object of the derived class. Likewise, if you invoke a virtual function all it needs to know is the location of the vtable pointer. As far as it cares, something like Base::func1 basically just means it follows the vtable pointer, then uses a pointer to a function at some specified offset from there (e.g., the fourth function pointer).

    At least for now, I'm going to ignore multiple inheritance. It adds quite a bit of complexity to the picture (especially when virtual inheritance gets involved) and you haven't mentioned it at all, so I doubt you really care.

    As to accessing any of this, or using in any way other than simply calling virtual functions: you may be able to come up with something for a specific compiler--but don't expect it to be portable at all. Although things like debuggers often need to look at such stuff, the code involved tends to be quite fragile and compiler-specific.