The MongoDB C++ driver allows two ways (among others) of creating BSON objects.
Stream-based:
auto builder = bsoncxx::builder::stream::document{};
bsoncxx::document::value doc_value = builder
<< "name" << "MongoDB"
<< "type" << "database"
<< "count" << 1
<< "versions" << bsoncxx::builder::stream::open_array
<< "v3.2" << "v3.0" << "v2.6"
<< bsoncxx::builder::stream::close_array
<< "info" << bsoncxx::builder::stream::open_document
<< "x" << 203
<< "y" << 102
<< bsoncxx::builder::stream::close_document
<< bsoncxx::builder::stream::finalize;
Based on parsing a JSON string:
std::string doc = "{ "
"\"name\" : \"MongoDB\","
"\"type\" : \"database\","
"\"count\" : 1,"
"\"versions\": [ \"v3.2\", \"v3.0\", \"v2.6\" ],"
"\"info\" : {"
"\"x\" : 203,"
"\"y\" : 102"
"}"
"}";
bsoncxx::document::value doc_value = bsoncxx::from_json(doc);
I would like to know which one performs better. My guess is that the many function calls the stream alternative makes "under the hood" would make it slower than processing the JSON string, but it could be the other way around, or the two could be equal.
I have tried to find information about this in the MongoDB C++ driver documentation, with no luck. Any information is welcome. Thanks in advance!
I ended up doing some benchmarking. I'm sharing my results in case they are useful to others. The driver version is 3.4.0.
This is the stream-based version:
#include <iostream>
#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
int main(int, char**) {
mongocxx::instance inst{};
mongocxx::client conn{mongocxx::uri{}};
for (unsigned int ix = 0; ix < 10000000 ; ++ix) {
auto builder = bsoncxx::builder::stream::document{};
bsoncxx::document::value doc_value = builder
<< "name" << "MongoDB"
<< "type" << "database"
<< "count" << 1
<< "versions" << bsoncxx::builder::stream::open_array
<< "v3.2" << "v3.0" << "v2.6"
<< bsoncxx::builder::stream::close_array
<< "info" << bsoncxx::builder::stream::open_document
<< "x" << 203
<< "y" << 102
<< bsoncxx::builder::stream::close_document
<< bsoncxx::builder::stream::finalize;
}
}
This is the text-parsing-based version:
#include <iostream>
#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
int main(int, char**) {
mongocxx::instance inst{};
mongocxx::client conn{mongocxx::uri{}};
for (unsigned int ix = 0; ix < 10000000 ; ++ix) {
std::string doc = "{ "
"\"name\" : \"MongoDB\","
"\"type\" : \"database\","
"\"count\" : 1,"
"\"versions\": [ \"v3.2\", \"v3.0\", \"v2.6\" ],"
"\"info\" : {"
"\"x\" : 203,"
"\"y\" : 102"
"}"
"}";
bsoncxx::document::value doc_value = bsoncxx::from_json(doc);
}
}
As you can see, the structure of the program and the number of iterations (10,000,000) are the same in both cases.
Compiled using:
c++ --std=c++11 test-stream.cpp -o test-stream $(pkg-config --cflags --libs libmongocxx)
c++ --std=c++11 test-textparsing.cpp -o test-textparsing $(pkg-config --cflags --libs libmongocxx)
The results with test-stream (three times):
$ time ./test-stream ; time ./test-stream ; time ./test-stream
real 0m16,454s
user 0m16,200s
sys 0m0,084s
real 0m17,034s
user 0m16,900s
sys 0m0,012s
real 0m18,812s
user 0m18,708s
sys 0m0,036s
The results with test-textparsing (also three times):
$ time ./test-textparsing ; time ./test-textparsing ; time ./test-textparsing
real 0m53,678s
user 0m53,576s
sys 0m0,024s
real 1m0,203s
user 0m59,788s
sys 0m0,116s
real 0m57,259s
user 0m56,824s
sys 0m0,200s
Conclusion: the stream-based strategy outperforms the text-parsing one by a large margin, taking roughly a third of the time.
A peer check of the experiment would be great to confirm the results ;)
EDIT: I have added a test case based on the basic builder:
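For anyone who wants to repeat the measurement, here is a minimal sketch of an in-process timer built on std::chrono. The names time_loop and time_runs are hypothetical helpers, not part of the driver. The idea is to time only the loop, since the `time` command also counts process startup and the construction of mongocxx::instance and mongocxx::client. Note also that the compile lines shown pass no optimization flag, so the numbers may look different with one.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the driver): runs `body` `iterations`
// times and returns the elapsed wall-clock time in seconds.
template <typename Body>
double time_loop(std::size_t iterations, Body body) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t ix = 0; ix < iterations; ++ix) {
        body(ix);  // the code under test, e.g. building one document
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: repeats the measurement `runs` times so that
// run-to-run variation is visible, like the three `time` invocations.
template <typename Body>
std::vector<double> time_runs(std::size_t runs, std::size_t iterations, Body body) {
    std::vector<double> seconds;
    seconds.reserve(runs);
    for (std::size_t r = 0; r < runs; ++r) {
        seconds.push_back(time_loop(iterations, body));
    }
    return seconds;
}
```

To use it, the body of each for loop in the test programs would go into a lambda passed to time_runs(3, 10000000, ...). The lambda should do something observable with doc_value (for example, add its length to a counter that is printed at the end) so that an optimizing compiler cannot discard the work.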
#include <iostream>
#include <bsoncxx/builder/basic/array.hpp>
#include <bsoncxx/builder/basic/document.hpp>
#include <bsoncxx/builder/basic/kvp.hpp>
#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
using bsoncxx::builder::basic::kvp;
int main(int, char**) {
mongocxx::instance inst{};
mongocxx::client conn{mongocxx::uri{}};
for (unsigned int ix = 0; ix < 10000000 ; ++ix) {
bsoncxx::builder::basic::document basic_builder{};
basic_builder.append(kvp("name", "MongoDB"));
basic_builder.append(kvp("type", "database"));
basic_builder.append(kvp("count", 1));
bsoncxx::builder::basic::array array_builder{};
array_builder.append("v3.2");
array_builder.append("v3.0");
array_builder.append("v2.6");
basic_builder.append(kvp("versions", array_builder.extract()));
bsoncxx::builder::basic::document object_builder{};
object_builder.append(kvp("x", 203));
object_builder.append(kvp("y", 102));
basic_builder.append(kvp("info", object_builder.extract()));
bsoncxx::document::value doc_value = basic_builder.extract();
}
}
Compiled this way:
c++ --std=c++11 test-basic.cpp -o test-basic $(pkg-config --cflags --libs libmongocxx)
I ran the tests again, with these results:
basic
-----
real 0m20,725s
user 0m20,656s
sys 0m0,004s
real 0m20,651s
user 0m20,620s
sys 0m0,008s
real 0m20,102s
user 0m20,088s
sys 0m0,000s
stream
------
real 0m11,841s
user 0m11,780s
sys 0m0,024s
real 0m11,967s
user 0m11,932s
sys 0m0,008s
real 0m11,634s
user 0m11,616s
sys 0m0,008s
textparsing
-----------
real 0m37,209s
user 0m37,184s
sys 0m0,004s
real 0m36,336s
user 0m36,208s
sys 0m0,028s
real 0m35,840s
user 0m35,648s
sys 0m0,048s
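To put the second round of runs above into relative terms, here is a small sketch that averages the three "real" times per variant and divides by the stream average. The helper names mean_seconds and slowdown are mine, and the only data used are the timings listed above, converted to seconds. By this calculation the basic builder takes about 1.7 times as long as the stream builder, and text parsing about 3.1 times as long.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a set of timings, in seconds.
double mean_seconds(const std::vector<double>& samples) {
    assert(!samples.empty());
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// How many times longer `candidate` takes than `baseline`, comparing means.
double slowdown(const std::vector<double>& candidate,
                const std::vector<double>& baseline) {
    return mean_seconds(candidate) / mean_seconds(baseline);
}

// "real" times from the second round of runs, in seconds.
const std::vector<double> basic_runs  = {20.725, 20.651, 20.102};
const std::vector<double> stream_runs = {11.841, 11.967, 11.634};
const std::vector<double> text_runs   = {37.209, 36.336, 35.840};
```

Calling slowdown(basic_runs, stream_runs) and slowdown(text_runs, stream_runs) gives the two ratios. Three runs per variant is a small sample, so these are rough figures rather than precise ones.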
Conclusions:
Before starting the experiment I would have bet that the basic builder would win, but the stream-based one came out ahead. Maybe there is something wrong in my test-basic.cpp code? Or does the result make sense?