I’m debugging a segmentation fault in a legacy codebase. The only information I have is a stack trace from a core-dump text file; no base addresses are available, so I can’t map the fault to a source line.
But within the function, the only place that might cause it is equivalent to:
auto newInterface = reinterpret_cast<INew*>(ptr);
auto i_ptr = newInterface->queryInterface(someGuid);
The two interfaces are completely unrelated:
struct OldGuid { unsigned char value[16]; };
struct OldRef;
struct NewGuid { char value[16]; };
struct OldInterface {
    virtual ~OldInterface() = default;
protected:
    virtual OldRef* queryInterface(const OldGuid*) { // default impl
        return nullptr;
    }
public:
    // NVI wrapper around the virtual function
    template <typename T>
    OldRef* queryInterface(); // ...
    /* other (possibly virtual) members */
};
struct NewInterface {
    virtual ~NewInterface() = default;
    virtual void* queryInterface(const NewGuid&) = 0; // note: const ref, not pointer
    /* other (possibly virtual) members */
};
This code has “worked for years”, but I believe it’s undefined behavior because reinterpret_cast between types is only allowed if they are standard-layout, and having virtual functions disqualifies them from being standard-layout.
Is this reasoning correct?
What if the cast were performed via void* and not in the same translation unit? Would the expectation of an 'equivalent virtual table offset' be valid in this case?
I found essentially the same question asked 13 years ago, but none of the answers quote the Standard.
According to [expr.reinterpret.cast]/7, in this case reinterpret_cast is defined via static_cast:
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)).
static_cast for this case is defined in [expr.static.cast]/13:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value ([basic.compound]) is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type similar to T that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
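To make the two quoted paragraphs concrete, here is a minimal sketch. The types `A` and `B` and the helper `same_result_via_void` are hypothetical names introduced only for illustration. It compares the pointer produced by reinterpret_cast with the one produced by the two-step static_cast through void*, which suggests that routing the cast through void* yourself should not change the outcome.

```cpp
#include <cassert>

// Hypothetical, unrelated polymorphic types standing in for the two interfaces.
struct A { virtual ~A() = default; };
struct B { virtual ~B() = default; };

// Returns true if both spellings of the cast yield the same pointer value
// and that value is still the address of the original object.
// Only pointer values are compared; nothing is accessed as a B.
bool same_result_via_void(A* p) {
    B* direct   = reinterpret_cast<B*>(p);
    B* via_void = static_cast<B*>(static_cast<void*>(p));
    return direct == via_void
        && static_cast<void*>(direct) == static_cast<void*>(p);
}
```

Calling `same_result_via_void(&a)` on any `A a;` is expected to return true, since neither form adjusts the address.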
So your cast should work if the two objects are pointer-interconvertible, a property which is defined in [basic.compound]/5:
Two objects a and b are pointer-interconvertible if
they are the same object, or
one is a union object and the other is a non-static data member of that object ([class.union]), or
one is a standard-layout class object and the other is the first non-static data member of that object or any base class subobject of that object ([class.mem]), or
there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast ([expr.reinterpret.cast]).
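As a rough illustration of the standard-layout bullet, the sketch below uses hypothetical types (`Header`, `Packet`, `Polymorphic`) that are not part of the question. It shows one case where pointer-interconvertibility does apply, and checks at compile time that a class with a virtual function is not standard-layout.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical standard-layout types: no virtual functions, no base classes.
struct Header { int id; };
struct Packet { Header header; int payload; };

// Hypothetical polymorphic type, mirroring the interfaces in the question.
struct Polymorphic { virtual ~Polymorphic() = default; int id; };

static_assert(std::is_standard_layout<Packet>::value,
              "Packet is standard-layout");
static_assert(!std::is_standard_layout<Polymorphic>::value,
              "virtual functions make a class non-standard-layout");

// Packet is standard-layout and `header` is its first non-static data member,
// so a Packet object and its header are pointer-interconvertible and this
// reinterpret_cast yields a pointer to the Header subobject.
Header* first_member(Packet* p) {
    return reinterpret_cast<Header*>(p);
}
```

For example, given `Packet p{{7}, 42};`, the expression `first_member(&p)` equals `&p.header`. Nothing comparable holds between two unrelated polymorphic classes.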
As you can see, none of these cases fit your code, so "the pointer value is unchanged by the conversion". The resulting pointer therefore still points to an object of type OldInterface, and calling a member function through it as if it were a NewInterface is undefined behavior. The cast itself is valid, though: you can cast the result back to OldInterface* and dereference that if you want.
That said, in practice this will work fine on any sane ABI as long as you make sure to keep the virtual method declaration order consistent (which you do). Since there aren't any optimization opportunities here, there isn't really any reason for the compiler to suddenly break this code. It is very likely that the issue is actually elsewhere.