c++ endianness uint8t uint16

Converting uint8_t* buffer to uint16_t and changing endianness


I'd like to process data provided by an external library.

The lib holds the data and provides access to it like this:

const uint8_t* data;
std::pair<const uint8_t*, const uint8_t*> getvalue() const {
  return std::make_pair(data + offset, data + length);
}

I know that the current data contains two uint16_t numbers, but I need to change their endianness. So altogether the data is 4 bytes long and contains these numbers:

66 4 0 0

So I'd like to get two uint16_t numbers with the values 1090 and 0, respectively.

I can do basic arithmetic and change the endianness in one place:

pair<const uint8_t*, const uint8_t*> dataPtrs = library.getvalue();
vector<uint8_t> data(dataPtrs.first, dataPtrs.second);

uint16_t first = (data[1] << 8) | data[0];
uint16_t second = (data[3] << 8) | data[2];

However, I'd like to do something more elegant (the vector is replaceable if there is a better way of getting the uint16_ts).

How can I better create a uint16_t from a uint8_t*? I'd like to avoid memcpy if possible and use something more modern/safe.

Boost has a nice header-only endian library that could work, but it needs a uint16_t input.

Going further, Boost also provides data types for changing endianness, so I could create a struct:

struct datatype {
    big_int16_buf_t     data1;
    big_int16_buf_t     data2;
};

Is it possible to safely (padding, platform dependency, etc.) cast a valid, 4-byte-long uint8_t* to datatype? Maybe with something like this union?

typedef union {
    uint8_t u8[4];
    datatype correct_data;
} mydata;

Solution

  • Maybe with something like this union?

    No. Type punning with unions is not well defined in C++: in general, reading a union member other than the one most recently written is undefined behavior.

    This would work, assuming big_int16_buf_t, and therefore datatype, is trivially copyable:

    datatype d{};
    std::memcpy(&d, data, sizeof d);
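
To make the memcpy route concrete without depending on Boost, here is a minimal sketch. The names le_u16_buf, value() and load_datatype are hypothetical stand-ins invented for this example, not Boost's API, and the stand-in is deliberately little endian so that the sample bytes 66 4 decode to 1090. The static_asserts check the two assumptions the copy relies on: trivially copyable, and no padding.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for an endian buffer type (NOT Boost's class):
// two raw bytes, interpreted as little endian on request.
struct le_u16_buf {
    unsigned char bytes[2];
    std::uint16_t value() const {
        return static_cast<std::uint16_t>((bytes[1] << 8) | bytes[0]);
    }
};

struct datatype {
    le_u16_buf data1;
    le_u16_buf data2;
};

// memcpy into an object is only well defined for trivially copyable types.
static_assert(std::is_trivially_copyable<datatype>::value,
              "datatype must be trivially copyable");
// Guard against unexpected padding: the wire format is exactly 4 bytes.
static_assert(sizeof(datatype) == 4, "datatype must be exactly 4 bytes");

// Hypothetical helper: copy 4 bytes from the buffer into a datatype.
// The caller must guarantee that at least 4 bytes are readable.
inline datatype load_datatype(const std::uint8_t* data) {
    datatype d{};
    std::memcpy(&d, data, sizeof d);
    return d;
}
```

With the bytes from the question, load_datatype(ptr).data1.value() should give 1090 and data2.value() should give 0.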
    
    uint16_t first = (data[1] << 8) | data[0];
    uint16_t second = (data[3] << 8) | data[2];

    However I'd like to do something more elegant

    Your shift-based approach is actually (subjectively, in my opinion) quite elegant, because it works the same way on all systems: it reads the data as little endian whether the CPU is little endian, big endian or something else, so it is portable. Just make sure the shift is parenthesized, because + binds tighter than <<.
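
If you want the shift wrapped up once, a small helper is enough. This is only a sketch, and read_le16 is a name made up for this example, not something from the library or from Boost:

```cpp
#include <cstdint>

// Hypothetical helper: assemble a 16-bit value from two bytes that are
// stored least-significant byte first (little endian), regardless of the
// endianness of the CPU running the code.
constexpr std::uint16_t read_le16(const std::uint8_t* p) {
    // The parentheses matter: + and | aside, `p[1] << 8 + p[0]` would be
    // parsed as `p[1] << (8 + p[0])` because + binds tighter than <<.
    return static_cast<std::uint16_t>((p[1] << 8) | p[0]);
}
```

For the bytes 66 4 0 0, read_le16(data) evaluates to 66 + 4 * 256 = 1090 and read_le16(data + 2) to 0.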

    However I'd like to do something more elegant (the vector is replaceable if there is better way for getting the uint16_ts).

    The vector seems entirely pointless. You could just as well use:

    const std::uint8_t* data = dataPtrs.first;
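
Putting that together, here is a sketch that decodes both values straight from the pointer pair, with a length check instead of the vector. decode_pair is a hypothetical name for this example, and I am assuming the pair is a [begin, end) range as in your vector construction:

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: decode two little-endian 16-bit values directly from
// the [range.first, range.second) byte range, with no intermediate vector.
// Returns false and leaves the outputs untouched if fewer than 4 bytes
// are available.
inline bool decode_pair(std::pair<const std::uint8_t*, const std::uint8_t*> range,
                        std::uint16_t& first, std::uint16_t& second) {
    const std::uint8_t* data = range.first;
    if (range.second - range.first < 4) {
        return false;
    }
    first  = static_cast<std::uint16_t>((data[1] << 8) | data[0]);
    second = static_cast<std::uint16_t>((data[3] << 8) | data[2]);
    return true;
}
```

The size check is the only thing the vector was giving you for free (via size()), so it is worth keeping in some form.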