c++ performance network-programming template-meta-programming low-latency

How should I approach parsing a network packet using C++ templates?


Let's say I have an application that keeps receiving a byte stream from a socket. I have documentation that describes the packet layout: the total header size, the total payload size, and the data type at each byte offset. I want to parse the packet into a struct. The approach I can think of is to declare a struct and disable padding with a compiler attribute, something like:

#include <cstdint>

struct Payload
{
    char field1;
    std::uint32_t field2;
    std::uint32_t field3;
    char field5;
} __attribute__((packed));  // GCC/Clang-specific attribute

and then I can declare a buffer, memcpy the bytes into it, and reinterpret_cast it to my struct. Another way I can think of is to process the bytes one by one and fill in the struct field by field. I think either one would work, but both feel old-school and probably unsafe.

The reinterpret_cast approach mentioned above would look something like:

void receive(const char* data, std::size_t data_size)
{
    if (data_size == sizeof(Payload))
    {
        const Payload* payload = reinterpret_cast<const Payload*>(data);
        // ... further processing ...
    }
}

I'm wondering whether there are better approaches (more modern C++ style, more elegant) for this kind of use case. I feel like metaprogramming should help, but I don't know how to apply it.

Can anyone share some thoughts, or point me to related references, resources, or relevant open-source code, so that I can learn how to solve this kind of problem more elegantly?


Solution

  • There are many different ways of approaching this. Here's one:

    Keeping in mind that reading a struct from a network stream is semantically the same thing as reading a single value, the operation should look the same in either case.

    Note that from what you posted, I am inferring that you will not be dealing with types with non-trivial default constructors. If that were the case, I would approach things a bit differently.

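    In this approach, we define a generic read_into for integral types that handles byte order, then a Payload-specific read_into that reads each field in turn: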

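    #include <cstddef>      // std::size_t, std::byte
    #include <cstdint>
    #include <bit>
    #include <concepts>
    #include <array>
    #include <algorithm>
    #include <type_traits>  // std::has_unique_object_representations_v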
    
    // Use std::byteswap when available. In the meantime, just lift the implementation from 
    // https://en.cppreference.com/w/cpp/numeric/byteswap
    template<std::integral T>
    constexpr T byteswap(T value) noexcept
    {
        static_assert(std::has_unique_object_representations_v<T>, "T may not have padding bits");
        auto value_representation = std::bit_cast<std::array<std::byte, sizeof(T)>>(value);
        std::ranges::reverse(value_representation);
        return std::bit_cast<T>(value_representation);
    }
    
    template<typename T>
    concept DataSource = requires(T& x, char* dst, std::size_t size ) {
      {x.read(dst, size)};
    };
    
    // General read implementation for all integral types
    template<std::endian network_order = std::endian::big>
    void read_into(DataSource auto& src, std::integral auto& dst) {
      src.read(reinterpret_cast<char*>(&dst), sizeof(dst));
    
      if constexpr (sizeof(dst) > 1 && std::endian::native != network_order) {
        dst = byteswap(dst);
      }
    }
    
    struct Payload
    {
       char field1;
       std::uint32_t field2;
       std::uint32_t field3;
       char field5;
    };
    
    // Read implementation specific to Payload
    void read_into(DataSource auto& src, Payload& dst) {
      read_into(src, dst.field1);
      // The template argument overrides the byte order for a single field
      read_into<std::endian::little>(src, dst.field2);
      read_into(src, dst.field3);
      read_into(src, dst.field5);
    }
    
    // Mind you, nothing stops you from reading directly into a packed struct, but beware of endianness issues:
    // struct Payload
    // {
    //    char field1;
    //    std::uint32_t field2;
    //    std::uint32_t field3;
    //    char field5;
    // } __attribute__((packed));
    // void read_into(DataSource auto& src, Payload& dst) {
    //   src.read(reinterpret_cast<char*>(&dst), sizeof(Payload));
    // }
    
    // Example
    struct some_data_source {
      std::size_t read(char*, std::size_t size);
    };
    
    void foo() {
        some_data_source data;
    
        Payload p;
        read_into(data, p);
    }
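
The some_data_source above is only declared. As a minimal sketch of what could sit behind it when the bytes have already been received into a buffer, here is a hypothetical memory_source (the class name and the remaining() helper are assumptions, not part of the answer). It provides the read(char*, std::size_t) member that DataSource asks for and reports short reads instead of running past the end of the buffer:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical in-memory source over an already-received buffer.
// It does not own the bytes; the caller keeps the buffer alive.
class memory_source {
public:
    memory_source(const char* data, std::size_t size)
        : cur_(data), left_(size) {}

    // Copies at most `size` bytes into dst and returns how many were copied,
    // so a truncated packet shows up as a short read rather than an overrun.
    std::size_t read(char* dst, std::size_t size) {
        const std::size_t n = std::min(size, left_);
        if (n != 0) {
            std::memcpy(dst, cur_, n);
            cur_ += n;
            left_ -= n;
        }
        return n;
    }

    // Number of bytes not yet consumed.
    std::size_t remaining() const { return left_; }

private:
    const char* cur_;
    std::size_t left_;
};
```

Note that read_into as written ignores the value returned by read, so with a source like this you would either check the total size up front, as the receive function in the question does, or check remaining() before parsing.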
    

    An alternative API could have been dst.field2 = read<std::uint32_t>(src), which has the drawback of requiring you to spell out the type explicitly, but is more appropriate if you have to deal with types that have non-trivial constructors.
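
As a rough sketch of that value-returning style, assume a hypothetical read_be helper and read_payload function (neither is from the answer). The data source here is std::istringstream, since std::istream already has a read(char*, count) member. Instead of byteswap, this version assembles the value from individual bytes with shifts, which gives big-endian decoding regardless of the host's byte order:

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: reads sizeof(T) bytes from src and decodes them as a
// big-endian integer. Source only needs a read(char*, size) member.
template <typename T, typename Source>
T read_be(Source& src) {
    static_assert(std::is_integral_v<T>, "this sketch only handles integral types");
    unsigned char bytes[sizeof(T)] = {};
    src.read(reinterpret_cast<char*>(bytes), sizeof(bytes));

    // Most significant byte first; no dependence on std::endian::native.
    std::uintmax_t value = 0;
    for (unsigned char b : bytes) {
        value = (value << 8) | b;
    }
    // For signed T, out-of-range values convert modulo 2^N (guaranteed since
    // C++20, implementation-defined before that).
    return static_cast<T>(value);
}

struct Payload {
    char field1;
    std::uint32_t field2;
    std::uint32_t field3;
    char field5;
};

// Hypothetical per-struct reader in the value-returning style.
// This sketch treats every field as big-endian.
template <typename Source>
Payload read_payload(Source& src) {
    Payload p{};
    p.field1 = read_be<char>(src);
    p.field2 = read_be<std::uint32_t>(src);
    p.field3 = read_be<std::uint32_t>(src);
    p.field5 = read_be<char>(src);
    return p;
}
```

The fields are assigned in separate statements on purpose: the order in which function arguments are evaluated is unspecified, so something like make_payload(read_be<char>(src), read_be<std::uint32_t>(src)) could consume the bytes in the wrong order. Brace-initialization, which evaluates its elements left to right, would also be safe.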

    See the read_into example in action on Godbolt: https://gcc.godbolt.org/z/77rvYE1qn