c++ mongodb c++14 mongo-cxx-driver mongo-c-driver

mongocxx count documents in collection


Counting the number of documents in a collection is easy and fast (apparently constant time) in the shell.

> db.my_collection.count()
12345678
>

In C++ I tried this:

mongocxx::client client;
mongocxx::database db = MongoInit(client, ...);
vector<string> collection_names;
mongocxx::cursor cursor = db.list_collections();
for (const bsoncxx::document::view& doc : cursor) {
    string collection_name = doc["name"].get_utf8().value.to_string();
    collection_names.push_back(collection_name);
}

bsoncxx::document::view empty_filter;
for (const string& collection_name : collection_names) {
    LOG_INFO << collection_name;
    mongocxx::collection collection = db[collection_name];
    int64_t collection_count = collection.count_documents(empty_filter);
    LOG_INFO << collection_name << "    " << collection_count;
}

This code works but is strangely slow. Have I done something wrong?


Solution

  • count and count_documents are very different functions.

    MongoDB maintains metadata about each collection which contains the number of documents stored. This number is incremented when a document is inserted, and decremented when one is deleted. It is possible for this number to get out of sync with the collection, so it should be treated as an approximation.

    The count function simply reads that number from the metadata and returns it, which is why it completes in constant time.

    The count_documents function scans the collection to get the exact document count, rather than the approximate count from metadata. Its running time therefore grows with the size of the collection, which is why your loop is slow.

    If you need the result to be fast rather than precise, use the estimated_document_count function.
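As a rough sketch of that last suggestion, the loop from the question could be switched to estimated_document_count, which mongocxx::collection provides alongside count_documents. The helper name estimated_counts is made up for illustration. It is written as a template only so that the snippet compiles without the driver headers; in real code you would pass the mongocxx::database from the question.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of mongocxx): returns the metadata-based
// count for each named collection.
// Database is assumed to behave like mongocxx::database: db[name] yields a
// collection object exposing estimated_document_count().
template <typename Database>
std::vector<std::pair<std::string, std::int64_t>>
estimated_counts(Database& db, const std::vector<std::string>& collection_names) {
    std::vector<std::pair<std::string, std::int64_t>> counts;
    counts.reserve(collection_names.size());
    for (const std::string& name : collection_names) {
        auto collection = db[name];
        // Reads the count from collection metadata instead of scanning,
        // so the value is approximate but cheap to obtain.
        std::int64_t n = collection.estimated_document_count();
        counts.emplace_back(name, n);
    }
    return counts;
}
```

Calling estimated_counts(db, collection_names) in place of the second loop should return quickly even for large collections, at the cost of the counts being approximate. Keep count_documents for the cases where you need an exact figure or a non-empty filter.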