c++ vector struct file-io cstdio

How to read and write non-fixed-length structs to a binary file in C++


I have a vector of structs:

struct entry
{
    uint64_t id = 0;
    std::string name;
    std::vector<uint64_t> data;
};

that I want to write to a file:

FILE *testFile = nullptr;
testFile = fopen("test.b", "wb");

However, the usual approach to reading and writing

fwrite(vector.data(), sizeof vector[0], vector.size(), testFile);
fread(vector.data(), sizeof(entry), numberOfEntries, testFile);

does not work, because the amount of data in each entry can vary wildly depending on the contents of

std::string name;
std::vector<uint64_t> data;

so I would like pointers on how to read and write this data to and from files.


Solution

  • When dealing with variable-size data, you have to store its size alongside it. You can store either the number of fixed-size elements or the byte size of the whole structure, and compute whatever else you need when reading the struct back. I prefer the first option, though it can sometimes make debugging a bit harder.

    Here is an example of how to build a flexible serialization system.

    #include <cstddef>
    #include <cstring>
    #include <type_traits>
    #include <vector>
    
    // Placeholder element type; any trivially copyable type works here.
    struct other_data
    {
       int x;
    };
    
    struct my_data
    {
       int a;
       char c;
       std::vector<other_data> data;
    };
    
    // Trivially copyable types: append their raw bytes to the buffer.
    // Any type without a matching overload fails to compile.
    template<class T>
    requires std::is_trivially_copyable_v<T>
    void serialize(const T& v, std::vector<std::byte>& out)
    {
       const auto old_size = std::size(out);
       out.resize(old_size + sizeof(T));
       std::memcpy(std::data(out) + old_size, &v, sizeof(T));
    }
    
    // Vectors: function templates cannot be partially specialized,
    // so this is an overload rather than a specialization.
    template<class T>
    void serialize(const std::vector<T>& v, std::vector<std::byte>& out)
    {
       serialize(std::size(v), out); // add size
       for(const auto& e : v)
          serialize(e, out);
    }
    
    // User-defined types: serialize member by member.
    void serialize(const my_data& v, std::vector<std::byte>& out)
    {
       serialize(v.a, out);
       serialize(v.c, out);
       serialize(v.data, out);
    }
    
    // And likewise you would do for deserialize
    
    int main()
    {
       std::vector<std::byte> data;
       my_data a{};
       serialize(a, data);
    
       // write vector of bytes to file
    }
    
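    The example above leaves deserialization open, so here is a minimal sketch of what the reading side could look like. The `reader` struct is a hypothetical helper that is not part of the code above, and the element count is written as a fixed-width `std::uint64_t`, an assumption made so the format does not depend on the platform's `size_t`. The writers are repeated so the block compiles on its own (C++20).

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <span>
    #include <stdexcept>
    #include <type_traits>
    #include <utility>
    #include <vector>

    // Hypothetical cursor over the serialized bytes (an assumption of this sketch).
    struct reader
    {
       std::span<const std::byte> bytes;
       std::size_t pos = 0;
    };

    // Writers, repeated here so the sketch is self-contained.
    template<class T>
    requires std::is_trivially_copyable_v<T>
    void serialize(const T& v, std::vector<std::byte>& out)
    {
       const auto old_size = out.size();
       out.resize(old_size + sizeof(T));
       std::memcpy(out.data() + old_size, &v, sizeof(T));
    }

    template<class T>
    void serialize(const std::vector<T>& v, std::vector<std::byte>& out)
    {
       serialize(static_cast<std::uint64_t>(v.size()), out); // element count first
       for(const auto& e : v)
          serialize(e, out);
    }

    // Trivially copyable types: copy sizeof(T) bytes back out, with a bounds check.
    template<class T>
    requires std::is_trivially_copyable_v<T>
    void deserialize(T& v, reader& in)
    {
       if(in.bytes.size() - in.pos < sizeof(T))
          throw std::runtime_error("unexpected end of data");
       std::memcpy(&v, in.bytes.data() + in.pos, sizeof(T));
       in.pos += sizeof(T);
    }

    // Vectors: read the element count, then that many elements.
    template<class T>
    void deserialize(std::vector<T>& v, reader& in)
    {
       std::uint64_t count = 0;
       deserialize(count, in);
       v.clear();
       for(std::uint64_t i = 0; i < count; ++i)
       {
          T e{};
          deserialize(e, in);
          v.push_back(std::move(e));
       }
    }
    ```

    Fields have to be read in the same order they were written: construct `reader in{buffer};` and call `deserialize` once per field. A user-defined type such as `my_data` would get its own member-by-member `deserialize` overload, mirroring its `serialize`.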

    This is a tedious job, and there are already libraries that do it for you, such as Google's FlatBuffers, Google's Protocol Buffers (Protobuf), or the single-header BinaryLove3. Some of them work out of the box with aggregate types (roughly, types whose member variables are all public). Here is an example of BinaryLove3 in action.

    #include <iostream>
    #include <vector>
    #include <string>
    #include <cstdint>
    #include <cmath>   // float_t
    #include <list>
    
    #include "BinaryLove3.hpp"
    
    struct foo
    {
        uint32_t v0 = 3;
        uint32_t v1 = 2;
        float_t v2 = 2.5f;
        char v3 = 'c';
        struct
        {
            std::vector<int> vec_of_trivial = { 1, 2, 3 };
            std::vector<std::string> vec_of_nontrivial = { "I am a Fox!", "In a big Box!" };
            std::string str = "Foxes can fly!";
            std::list<int> non_random_access_container = { 3, 4, 5 };
        } non_trivial;
        struct
        {
            uint32_t v0 = 1;
            uint32_t v1 = 2;
        } trivial;
    };
    
    int main()
    {
        foo out = { 4, 5, 6.7f, 'd', {{5, 4, 3, 2}, {"cc", "dd"}, "Fly me to the moon..." , {7, 8, 9}}, {3, 4} };
        auto data = BinaryLove3::serialize(out);
        
        foo in;
        BinaryLove3::deserialize(data, in);
        return 0;
    }
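
    If you would rather stay with plain `cstdio`, the count-prefix idea can also be applied directly to the `entry` struct from the question. The following is only a sketch: the helper names (`write_u64`, `read_u64`, `write_entry`, `read_entry`, `write_entries`, `read_entries`) are made up for illustration, the file is assumed to be trusted (a corrupt length would trigger a huge allocation), and integers are written in the host's byte order, so the file is not portable across architectures.

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <string>
    #include <utility>
    #include <vector>

    struct entry
    {
        std::uint64_t id = 0;
        std::string name;
        std::vector<std::uint64_t> data;
    };

    // Hypothetical helpers: each returns false on a short write or read.
    bool write_u64(std::FILE* f, std::uint64_t v)
    {
        return std::fwrite(&v, sizeof v, 1, f) == 1;
    }

    bool read_u64(std::FILE* f, std::uint64_t& v)
    {
        return std::fread(&v, sizeof v, 1, f) == 1;
    }

    // Layout per entry: id, name length, name bytes, element count, elements.
    bool write_entry(std::FILE* f, const entry& e)
    {
        const std::size_t n = e.name.size(), m = e.data.size();
        if(!write_u64(f, e.id) || !write_u64(f, n)) return false;
        if(n != 0 && std::fwrite(e.name.data(), 1, n, f) != n) return false;
        if(!write_u64(f, m)) return false;
        if(m != 0 && std::fwrite(e.data.data(), sizeof(std::uint64_t), m, f) != m) return false;
        return true;
    }

    bool read_entry(std::FILE* f, entry& e)
    {
        std::uint64_t n = 0;
        if(!read_u64(f, e.id) || !read_u64(f, n)) return false;
        e.name.resize(n); // size the buffer from the stored length
        if(n != 0 && std::fread(e.name.data(), 1, n, f) != n) return false;
        if(!read_u64(f, n)) return false;
        e.data.resize(n);
        if(n != 0 && std::fread(e.data.data(), sizeof(std::uint64_t), n, f) != n) return false;
        return true;
    }

    // The file starts with the number of entries, followed by the entries.
    bool write_entries(const char* path, const std::vector<entry>& v)
    {
        std::FILE* f = std::fopen(path, "wb");
        if(!f) return false;
        bool ok = write_u64(f, v.size());
        for(std::size_t i = 0; ok && i < v.size(); ++i)
            ok = write_entry(f, v[i]);
        return std::fclose(f) == 0 && ok;
    }

    bool read_entries(const char* path, std::vector<entry>& v)
    {
        std::FILE* f = std::fopen(path, "rb");
        if(!f) return false;
        std::uint64_t count = 0;
        bool ok = read_u64(f, count);
        v.clear();
        for(std::uint64_t i = 0; ok && i < count; ++i)
        {
            entry e;
            ok = read_entry(f, e);
            if(ok) v.push_back(std::move(e));
        }
        std::fclose(f);
        return ok;
    }
    ```

    With these in place, `write_entries("test.b", entries)` followed by `read_entries("test.b", loaded)` should give back an equal vector. Writing `sizeof(entry)` bytes per element, as in the question, cannot work because `std::string` and `std::vector` keep their contents on the heap and the struct itself only holds pointers to them.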