I am receiving data from another host over a network socket in big-endian format. How can I interpret the received bytes in the native byte order (for example, by getting a view of those bytes or reinterpreting them) without copying them into a temporary variable?
#include <iostream>
#include <cstdint>
#include <sys/socket.h> // recvfrom

struct A {
    uint16_t msg_id;
    // contains many other fields where total size is greater than 4096
    void print() const {
        // print all the fields of struct A
    }
};

struct B {
    uint16_t msg_id;
    // contains many other fields where total size is greater than 4096
    void print() const {
        // print all the fields of struct B
    }
};

struct C {
    uint16_t msg_id;
    // contains many other fields where total size is greater than 4096
    void print() const {
        // print all the fields of struct C
    }
};

int main() {
    // sock_fd is assumed to be an already-open socket (setup not shown)
    char buff[8192];
    while (true) {
        // data is received in network byte order (big endian) but my system is little endian
        const auto recvd_len = recvfrom(sock_fd, buff, sizeof(buff), 0, nullptr, nullptr);
        // go through unsigned char so that a negative char is never shifted
        const uint16_t msg_id = static_cast<uint16_t>(
            (static_cast<unsigned char>(buff[0]) << 8) | static_cast<unsigned char>(buff[1]));
        switch (msg_id) {
        case 0x0001: {
            // reinterpret the bytes received as struct A, copy elision
            const A* a_obj = reinterpret_cast<const A*>(buff);
            a_obj->print();
            // the above print call works correctly only if my system is big endian but not little endian
        }
        break;
        case 0x0002: {
            // reinterpret the bytes received as struct B, copy elision
            const B* b_obj = reinterpret_cast<const B*>(buff);
            b_obj->print();
            // the above print call works correctly only if my system is big endian but not little endian
        }
        break;
        case 0x0003: {
            // reinterpret the bytes received as struct C, copy elision
            const C* c_obj = reinterpret_cast<const C*>(buff);
            c_obj->print();
            // the above print call works correctly only if my system is big endian but not little endian
        }
        break;
        default:
            break;
        }
    }
}
char buff[8192];
// ...
// reinterpret the bytes received as struct A, copy elision
const A* a_obj = reinterpret_cast<const A*>(buff);
a_obj->print();
In C++, this is always undefined behavior.
It is a strict aliasing violation because buff (after array-to-pointer conversion) is a pointer to a char object, but you are accessing its print member through the type A. That's not allowed; see [expr.ref] p9.
buff is also possibly under-aligned, and you need to use alignas(A) to ensure proper alignment for the A object potentially inside.
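As a rough illustration of the alignment point, the sketch below declares a receive buffer with alignas and checks at run time that its address is suitable for the struct. Msg, is_aligned_for, and aligned_buff are hypothetical names used only for this example; Msg stands in for the question's A, whose real fields are not shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical message layout, standing in for the question's struct A.
struct Msg {
    std::uint16_t msg_id;
    std::uint32_t sequence;
    std::uint64_t timestamp;
};

// Returns true if p is suitably aligned for an object of type T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// alignas(Msg) raises the buffer's alignment from alignof(char)
// to at least alignof(Msg).
alignas(Msg) char aligned_buff[8192];
```

Calling is_aligned_for<Msg>(aligned_buff) should then return true. A plain char buff[8192] without alignas may happen to pass the same check, but nothing guarantees it.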
C++23 added a function which is perfect for this use case, std::start_lifetime_as:
alignas(A) char buff[8192];
// ...
const A* a_obj = std::start_lifetime_as<A>(buff);
a_obj->print();
For this to work, A needs to be an implicit-lifetime class type, which looks possible in your case.
Unfortunately, no compiler implements std::start_lifetime_as at the time of writing, so you'll need a workaround that's technically UB, but not obviously so:
alignas(A) char buff[8192];
// ...
const A* a_obj = std::launder(reinterpret_cast<A*>(buff));
a_obj->print();
With std::launder, you're turning a pointer to a char object into a pointer to an A object at the same address.
This is still undefined behavior if there isn't actually an A object there, but you can assume that recvfrom puts an A there.
At least, the compiler cannot prove that recvfrom has not put an A there, so it will do what you want.
Keep in mind that all of these approaches assume that the given object is already in native byte order, not in network byte order. No endianness conversion is performed here.
To get meaningful values for your data members, you would need to correct their byte order after reinterpretation (no matter which approach), e.g. by using std::byteswap (C++23, declared in <bit>) for each member.
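A minimal sketch of the per-member correction could look like the following. It uses a hand-written helper instead of std::byteswap so that it compiles before C++23 and gives the same result on hosts of either endianness. Msg, from_big_endian, and to_native are hypothetical names, and the field layout is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Treats the object representation of `raw` as big-endian bytes and returns
// the value in native byte order. Works regardless of host endianness.
template <typename T>
T from_big_endian(T raw) {
    static_assert(std::is_unsigned<T>::value, "unsigned integer types only");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &raw, sizeof(T));
    T value = 0;
    for (unsigned char b : bytes) {
        // The most significant byte comes first on the wire.
        value = static_cast<T>((value << 8) | b);
    }
    return value;
}

// Hypothetical message layout (assumption: no padding between members).
struct Msg {
    std::uint16_t msg_id;
    std::uint16_t length;
    std::uint32_t sequence;
};
static_assert(sizeof(Msg) == 8, "this sketch assumes a padding-free layout");

// Converts every member from network byte order to native byte order, in place.
void to_native(Msg& m) {
    m.msg_id = from_big_endian(m.msg_id);
    m.length = from_big_endian(m.length);
    m.sequence = from_big_endian(m.sequence);
}
```

For example, the wire bytes 00 01 00 10 12 34 56 78 copied into a Msg and passed to to_native give msg_id 0x0001, length 0x0010, and sequence 0x12345678. Note that the conversion writes to the members, so it needs a mutable object rather than the const pointer used in the snippets above.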