c++ type-punning

Reinterpret bytes of a buffer according to system (native) endianness


I am receiving data from another host via a network socket in big-endian format. How can I interpret the received bytes in the native endian format (e.g. get a view of them or reinterpret them) without copying them into a temporary variable?

#include <iostream>
#include <cstdint>
#include <sys/socket.h> // recvfrom

struct A {
    uint16_t msg_id;
    // contains many other fields where total size is greater than 4096

    void print() const {
        // print all the fields of struct A
    }
};

struct B {
    uint16_t msg_id;
    // contains many other fields where total size is greater than 4096

    void print() const {
        // print all the fields of struct B
    }
};

struct C {
    uint16_t msg_id;
    // contains many other fields where total size is greater than 4096

    void print() const {
        // print all the fields of struct C
    }
};

int main() {
    char buff[8192];
    while (true) {
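        // data is received in network byte order (big endian) but my system is little endian
        // sock_fd is assumed to be an already bound socket; its setup is omitted here
        const auto recvd_len = recvfrom(sock_fd, buff, sizeof(buff), 0, nullptr, nullptr);
        if (recvd_len < 2) continue; // error, or too short to contain msg_id
        // go through unsigned char so that bytes >= 0x80 are not sign-extended
        const uint16_t msg_id = static_cast<uint16_t>(
            (static_cast<unsigned char>(buff[0]) << 8) | static_cast<unsigned char>(buff[1]));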

        switch (msg_id) {
            case 0x0001: {
                // reinterpret the bytes received as struct A, copy elision
                const A* a_obj = reinterpret_cast<const A*>(buff);
                a_obj->print();
                // the above print call works correctly only if my system is big endian but not little endian
            }

            break;

            case 0x0002: {
                // reinterpret the bytes received as struct B, copy elision
                const B* b_obj = reinterpret_cast<const B*>(buff);
                b_obj->print();
                // the above print call works correctly only if my system is big endian but not little endian
            }

            break;

            case 0x0003: {
                // reinterpret the bytes received as struct C, copy elision
                const C* c_obj = reinterpret_cast<const C*>(buff);
                c_obj->print();
                // the above print call works correctly only if my system is big endian but not little endian
            }

            break;

            default:
            break;
        }
    }
}

Solution

  • char buff[8192];
    // ...
    // reinterpret the bytes received as struct A, copy elision
    const A* a_obj = reinterpret_cast<const A*>(buff);
    a_obj->print();
    

    In C++, this is always undefined behavior. It is a strict aliasing violation because buff (after array-to-pointer conversion) is a pointer to a char object, but you are accessing its print member through the type A. That's not allowed; see [expr.ref] p9.

    buff is also possibly under-aligned, and you need to use alignas(A) to ensure proper alignment for the A object potentially inside.

    C++23 added a function which is perfect for this use case: std::start_lifetime_as:

    alignas(A) char buff[8192];
    // ...
    const A* a_obj = std::start_lifetime_as<A>(buff);
    a_obj->print();
    

    For this to work, A needs to be an implicit-lifetime class type, which looks possible in your case. Unfortunately, no compiler implements std::start_lifetime_as at the time of writing, so you'll need a workaround that's technically UB, but not obviously so:

    alignas(A) char buff[8192];
    // ...
    const A* a_obj = std::launder(reinterpret_cast<A*>(buff));
    a_obj->print();
    

    With std::launder, you're turning a pointer to a char object into a pointer to an A object at the same address. This is still undefined behavior if there isn't actually an A object there, but you can assume that recvfrom puts an A there. At least, the compiler cannot prove that recvfrom has not put an A there, so it will do what you want.
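The std::launder workaround can be wrapped in a small helper that also checks the two preconditions mentioned above (enough bytes received, suitable alignment). This is only a sketch under assumptions: `Msg` and `view_as` are hypothetical names that do not appear in the question, and the `static_assert` is a conservative stand-in for "implicit-lifetime class type", not the exact rule from the standard. It requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>          // std::launder (C++17)
#include <type_traits>

// Hypothetical message type standing in for struct A from the question.
struct Msg {
    std::uint16_t msg_id;
    std::uint16_t length;
};

// Returns a pointer to a T at the start of buf, or nullptr if fewer than
// sizeof(T) bytes are available or buf is not suitably aligned for T.
template <class T>
const T* view_as(const char* buf, std::size_t len) {
    // Conservative proxy (an assumption of this sketch): a trivially copyable,
    // trivially default-constructible type is an implicit-lifetime type.
    static_assert(std::is_trivially_copyable_v<T> &&
                  std::is_trivially_default_constructible_v<T>,
                  "T must be a plain data type");
    assert(buf != nullptr);
    if (len < sizeof(T)) {
        return nullptr;  // datagram too short to hold a T
    }
    if (reinterpret_cast<std::uintptr_t>(buf) % alignof(T) != 0) {
        return nullptr;  // buffer is under-aligned for T
    }
    // Same cast as in the answer; still relies on a T actually living there.
    return std::launder(reinterpret_cast<const T*>(buf));
}
```

In the question's loop this would be called as `view_as<A>(buff, recvd_len)` on a buffer declared with `alignas(A)`, checking the result for `nullptr` before calling `print()`.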

    Note on endianness conversion

    Keep in mind that all of these approaches assume that the given object is already in native byte order, not in network byte order. No endianness conversion is being performed here.

    To get meaningful values for your data members, you would need to correct their byte order after reinterpretation (no matter which approach), e.g. by using std::byteswap (C++23, declared in <bit>) for each multi-byte member.
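One way to do the per-member correction without modifying or copying the whole buffer is to keep the fields in wire order and convert each one on read through an accessor. The sketch below is an illustration under assumptions: `WireHeader`, its fields, and the `be16_to_host` / `be32_to_host` helpers are hypothetical names, not part of the question. Instead of std::byteswap, which needs C++23, it reassembles each value from its bytes with shifts; that gives the same result as a byte swap on a little-endian host and leaves the value unchanged on a big-endian one.

```cpp
#include <cstdint>
#include <cstring>

// Interpret the object representation of raw as a big-endian 16-bit value.
inline std::uint16_t be16_to_host(std::uint16_t raw) {
    unsigned char b[2];
    std::memcpy(b, &raw, sizeof b);
    return static_cast<std::uint16_t>((b[0] << 8) | b[1]);
}

// Interpret the object representation of raw as a big-endian 32-bit value.
inline std::uint32_t be32_to_host(std::uint32_t raw) {
    unsigned char b[4];
    std::memcpy(b, &raw, sizeof b);
    return (std::uint32_t{b[0]} << 24) | (std::uint32_t{b[1]} << 16) |
           (std::uint32_t{b[2]} << 8)  |  std::uint32_t{b[3]};
}

// Hypothetical wire layout: members hold the bytes exactly as received
// (network byte order); accessors convert to host order on every read.
struct WireHeader {
    std::uint16_t msg_id_be;
    std::uint16_t flags_be;
    std::uint32_t seq_be;

    std::uint16_t msg_id() const { return be16_to_host(msg_id_be); }
    std::uint16_t flags()  const { return be16_to_host(flags_be); }
    std::uint32_t seq()    const { return be32_to_host(seq_be); }
};
```

With this layout, `print()` would call the accessors instead of reading the members directly, so only the fields that are actually used get converted.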