c++ msgpack

MessagePack C++ - How to iterate through an unknown data structure?


I want to share structured data like the following between C++ and Python using MessagePack:

{
  "t" : [ [t00,...,t0N], ... , [tM0,...,tMN] ],
  "x" : [ x0,..,xN],
  "P" : [ [P00, ..., P0N], ..., [PM0,...,PMN] ]
}

Each variable is optional, so in some cases I will have, for example, only:

{
 "t" : [ [t00,...,t0N], ... , [tM0,...,tMN] ]
}

Decoding this in Python is pretty simple. My problem is figuring out how to decode it in C++ when I don't know the structure of the data, or the exact number of variables, in advance. Is it possible to iterate over the structure in these cases?

I managed to handle a "fixed" data structure (always with the same number of variables) by defining a struct, for example:

struct variables
{
   std::vector< std::vector<double> > t;
   std::vector< double > x;
   std::vector< std::vector<double> > P;
   MSGPACK_DEFINE_MAP( t, x, P );
};

std::stringstream inBuffer;

.... (read data )

std::string str( inBuffer.str() );
msgpack::object_handle oh = msgpack::unpack( str.data(), str.size() );
msgpack::object deserialized = oh.get();

variables var;
deserialized.convert( var );

Is there a better way to accomplish this? How can I handle optional variables that may not appear in the structure? And, to repeat the earlier question: can I iterate over an unknown data structure in C++, and if so, how?

Thanks in advance!

Regards, Ernesto


Solution

  • There are two ways to handle an unknown data structure.

    The first way is to use the parse/visitor mechanism. Here is an example:

    #include <msgpack.hpp>
    #include <sstream>
    #include <iostream>
    
    // This is a simple print example visitor.
    // You can do any processing in your visitor.
    struct my_visitor : msgpack::null_visitor {
        bool start_map_key() {
            processing_map_key = true;
            return true;
        }
        bool end_map_key() {
            processing_map_key = false;
            return true;
        }
        bool start_array(uint32_t size) {
            std::cout << "array (size:" << size << ")[" << std::endl;
            return true;
        }
        bool end_array() {
            std::cout << "]" << std::endl;
            return true;
        }
    
        bool visit_str(const char* v, uint32_t size) {
            if (processing_map_key) {
                std::cout << "map key:" << std::string(v, size) << std::endl;
            }
            return true;
        }
        bool visit_positive_integer(uint64_t v) {
            std::cout << "found value:" << v << std::endl;
            return true;
        }
    
        bool processing_map_key = false;
        std::string indent;
    };
    
    
    int main() {
        // create test data
        std::stringstream ss;
        msgpack::packer<std::stringstream> pk(ss);
        pk.pack_map(1);
        pk.pack("t");
        pk.pack_array(2);
        pk.pack_array(3);
        pk.pack(1);
        pk.pack(2);
        pk.pack(3);
        pk.pack_array(3);
        pk.pack(4);
        pk.pack(5);
        pk.pack(6);
    
        // print data (for debug)
        {
            auto oh = msgpack::unpack(ss.str().data(), ss.str().size());
            std::cout << oh.get() << std::endl;
        }
    
        // apply visitor
        {
            my_visitor mv;
            msgpack::parse(ss.str().data(), ss.str().size(), mv);
        }
    }
    

    Running demo: https://wandbox.org/permlink/3NrR4IMDIuLTk9e9

    See https://github.com/msgpack/msgpack-c/wiki/v2_0_cpp_visitor.

    The other way is to use msgpack::type::variant or msgpack::type::variant_ref. The former copies the data, so you can update it. The latter does not copy the data, so you cannot update it. This approach requires Boost, so you need to define MSGPACK_USE_BOOST. I recommend defining it as a compiler option.

    // Boost is required
    #define MSGPACK_USE_BOOST
    
    #include <msgpack.hpp>
    #include <sstream>
    #include <iostream>
    
    struct my_visitor:boost::static_visitor<void> {
        void operator()(uint64_t v) const {
            std::cout << "positive integer:" << v << std::endl;
        }
        // const is required for map key because std::multimap's key (first) is const.
        void operator()(std::string const& v) const {
            std::cout << "string:" << v << std::endl;
        }
        void operator()(std::vector<msgpack::type::variant>& v) const {
            std::cout << "array found" << std::endl;
            for (auto& e : v) {
                boost::apply_visitor(*this, e);
            }
        }
        void operator()(std::multimap<msgpack::type::variant, msgpack::type::variant>& v) const {
            std::cout << "map found" << std::endl;
            for (auto& e : v) {
                std::cout << "key:" << std::endl;
                boost::apply_visitor(*this, e.first);
                std::cout << "value:" << std::endl;
                boost::apply_visitor(*this, e.second);
            }
        }
        template <typename T>
        void operator()(T const&) const {
            std::cout << "  match others" << std::endl;
        }
    };
    
    int main() {
        // create test data
        std::stringstream ss;
        msgpack::packer<std::stringstream> pk(ss);
        pk.pack_map(1);
        pk.pack("t");
        pk.pack_array(2);
        pk.pack_array(3);
        pk.pack(1);
        pk.pack(2);
        pk.pack(3);
        pk.pack_array(3);
        pk.pack(4);
        pk.pack(5);
        pk.pack(6);
    
        auto oh = msgpack::unpack(ss.str().data(), ss.str().size());
        std::cout << oh.get() << std::endl;
    
        msgpack::type::variant v = oh.get().as<msgpack::type::variant>();
        boost::apply_visitor(my_visitor(), v);
    }
    

    Running demo: https://wandbox.org/permlink/HQwJjfwW8rLEMi0d

    See https://github.com/msgpack/msgpack-c/wiki/v2_0_cpp_variant

    Here are examples: https://github.com/msgpack/msgpack-c/blob/master/example/boost/msgpack_variant_capitalize.cpp https://github.com/msgpack/msgpack-c/blob/master/example/boost/msgpack_variant_mapbased.cpp

    Both ways can handle an unpredictable data structure, but you need to write some visitor processing. If the data structure is predictable to some extent, your original approach is also a good way.