c++ language-lawyer c++20

Is reinterpret_cast between unrelated types allowed if they are standard-layout in C++20?


There have already been many questions on casts, but C++20 adds a few constructs that relate to type casts, hence this question. Consider the following class structure.

#include <type_traits>

struct A {
  int index;
};

struct B : A {};

static_assert(std::is_standard_layout_v<A>, "test");
static_assert(std::is_standard_layout_v<B>, "test");

Both types are standard-layout, as the static_asserts verify. For such types, the standard states:

Two objects a and b are pointer-interconvertible if:

  • one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, any base class subobject of that object (11.4), or

This means that the following code should be allowed:

A a = {};
int* i = reinterpret_cast<int*>(&a);
B* b = reinterpret_cast<B*>(i);
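
As a side note, the C++20 additions to <type_traits> can state these relationships at compile time. The sketch below assumes a standard library that defines the feature-test macro __cpp_lib_is_pointer_interconvertible (the checks are skipped otherwise), and read_through_base is a hypothetical helper added only for illustration. The traits describe how an existing object relates to its own subobjects; they do not make a B exist where only an A was created.

```cpp
#include <cassert>
#include <type_traits>

struct A {
  int index;
};

struct B : A {};

static_assert(std::is_standard_layout_v<A>, "A must be standard-layout");
static_assert(std::is_standard_layout_v<B>, "B must be standard-layout");

#ifdef __cpp_lib_is_pointer_interconvertible
// An A object is pointer-interconvertible with its first member, index.
static_assert(std::is_pointer_interconvertible_with_class(&A::index),
              "A and A::index should be pointer-interconvertible");
// A B object is pointer-interconvertible with its A base subobject.
static_assert(std::is_pointer_interconvertible_base_of_v<A, B>,
              "B and its A base should be pointer-interconvertible");
#endif

// Hypothetical helper: going from an existing B to its A base is an
// ordinary upcast and needs no reinterpret_cast at all.
int read_through_base(B& b) {
  A* base = &b;
  return base->index;
}
```

Calling read_through_base on a real B object is fine because a B actually exists there; the cast in the question goes the other way, from an A that was never a B.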

However, my question is whether b can actually be used for anything. In particular, it seems to me that the type aliasing rules forbid reading or writing through the lvalue *b, per the following statement:

If a program attempts to read or modify the stored value of an object through an lvalue (until C++11) / a glvalue (since C++11) through which it is not type-accessible, the behavior is undefined.

The rules for type-accessibility are not entirely clear to me. After reading the description, it seems that the dynamic type of *b is A. Since this relates to type-based aliasing rules, a follow-up question is whether -fno-strict-aliasing would allow such conversions.

In case people are wondering about the context: this comes from a legacy code base where A has some accessors and B (and many other derived types) are essentially different views on this underlying object. It is sometimes necessary to convert between these types, and in the actual implementation this is expensive. At the time, people thought that since alignment and size are the same, such casts would be sound and cheaper than explicit construction. However, in general, this does not seem to be the case. Unfortunately, I could neither find a compiler flag or sanitizer that actually shows that this is UB, nor construct an example where this goes wrong.


Solution

  • It is important to keep in mind that a pointer value points at an object, and that object has its own type, which need not match the pointer's type.

    We can work through the meaning of the operations as follows:

    int* i = reinterpret_cast<int*>(&a);
    

    We look to [expr.reinterpret.cast]:

    When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)).

    static_cast<void*>(&a) is defined by [expr.static.cast] and ultimately [conv.ptr]:

    A prvalue of type “pointer to cv T”, where T is an object type, can be converted to a prvalue of type “pointer to cv void”. The pointer value ([basic.compound]) is unchanged by this conversion.

    Therefore, static_cast<void*>(&a) is a prvalue of type "pointer to void" with the same pointer value: pointing at the object a of type A.

    Then to determine the value of i, we have to look at static_cast<int*>(p) where p is that pointer. That is back to [expr.static.cast]:

    if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b

    We have a pointer (of type "pointer to void") to the object a, and a.index is an object of type int that is pointer-interconvertible with a, so we determine that i is a pointer to that int subobject a.index.

    We can apply the same analysis to the next line:

    B* b = reinterpret_cast<B*>(i);
    

    Again, the reinterpret_cast is equivalent to a static_cast through void*, and the first cast does not change the value. For the second cast, however, there is no object of type B that is pointer-interconvertible with a.index, so it falls into the next sentence:

    Otherwise, the pointer value is unchanged by the conversion.

    So b is a pointer of type B* which points at the object a.index of type int.

    What can you use b for? By the same logic as above, you can reinterpret_cast<int*>(b) and the pointer value will be unchanged and you will retrieve the pointer to a.index.

    What else can you do? Can you dereference and bind to a reference? [dcl.ref]

    A reference shall be initialized to refer to a valid object or function.

    Well that's not helpful! This is the subject of DR 453. That is the DR which adds the "type-accessible" language. But access only happens through scalar lvalues.
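
The round trip described above can be sketched as follows. This is a minimal illustration under the reading of [expr.static.cast] given in this answer, not a statement about what any particular compiler does; round_trip_preserves_pointer is a hypothetical name, and A and B are redeclared so the snippet stands alone. The only access is through an int glvalue to an int object, and nothing is ever read or written through b itself.

```cpp
#include <cassert>

struct A {
  int index;
};

struct B : A {};

// Hypothetical helper: returns true if the pointer survives the round trip
// int* -> B* -> int* and the write lands in a.index.
bool round_trip_preserves_pointer() {
  A a{7};
  int* i = reinterpret_cast<int*>(&a);    // points at a.index
  B* b = reinterpret_cast<B*>(i);         // value unchanged: still points at a.index
  int* back = reinterpret_cast<int*>(b);  // value unchanged again
  *back = 42;                             // int glvalue accessing an int object
  return back == &a.index && a.index == 42;
}
```

What the sketch deliberately avoids is something like b->index, which would go through a B lvalue even though no B object exists at that address.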