mongodb  aggregation  mongodb-atlas  mongodb-nodejs-driver

Why is allowDiskUse not working in an aggregation pipeline for the $group stage?


I was working on an aggregation pipeline to clean up duplicate data in a collection, but it threw the following error even after I set { allowDiskUse: true }.

Err: "MongoServerError: PlanExecutor error during aggregation :: caused by :: Exceeded memory limit for $group, but didn't allow external spilling; pass allowDiskUse:true to opt in"

The objective is to delete all duplicates (documents with the same phoneNumber and the same startTime) from the database, retaining only the first entry of each set.

Below is my pipeline:

let pipeline = [
    {
        $group: {
            _id: {
                phoneNumber: "$details.phoneNumber",
                startTime: "$details.startTime",
            },
            docs: { $push: "$$ROOT" },
            count: { $sum: 1 },
        },
    },
    {
        $match: {
            count: { $gt: 1 },
        },
    },
    {
        $unwind: "$docs",
    },
    {
        $sort: {
            "docs.createdAt": 1,
        },
    },
    {
        $skip: 1,
    },
    {
        $replaceRoot: { newRoot: "$docs" },
    },
]

When calling aggregate, I passed allowDiskUse as described in the driver's types, but it still throws the same error.

This is how I added the option:

const options = {
    allowDiskUse: true,
    maxTimeMS: 10000,
}

Then, using the MongoDB Node.js driver v5.8, I ran the aggregation like this:

db.collection(collectionName).aggregate(pipeline, options)

I added allowDiskUse as the error message suggested, but it still does not work. MongoDB Compass throws the same error too.


So what is the issue here?

If there is another way to achieve my objective, please comment.

Thanks


Solution

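  • Make sure that you are not running on a MongoDB Atlas shared cluster (M0, M2, or M5). On these tiers allowDiskUse cannot be enabled, so the 100 MB per-stage memory limit still applies and the error persists even when the option is passed.

    See the Atlas documentation on the limitations of shared clusters.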
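
As for achieving the objective another way, one option is to make the $group stage much lighter so it is less likely to hit the memory limit in the first place. The following is a minimal sketch, not a definitive fix. It pushes only _id and createdAt into each group instead of the whole document ($$ROOT), and picks the entry to keep on the client. The helper names buildDuplicateKeyPipeline and idsToDelete are hypothetical, and the sketch assumes createdAt is a top-level Date (or number) on every document. On a very large collection the $group stage may still exceed the limit.

```javascript
// Hypothetical helper: groups by the duplicate key but stores only the
// fields needed to decide what to delete, not the full document.
function buildDuplicateKeyPipeline() {
  return [
    {
      $group: {
        _id: {
          phoneNumber: "$details.phoneNumber",
          startTime: "$details.startTime",
        },
        // Much smaller than { $push: "$$ROOT" }
        docs: { $push: { _id: "$_id", createdAt: "$createdAt" } },
        count: { $sum: 1 },
      },
    },
    // Keep only keys that actually have duplicates
    { $match: { count: { $gt: 1 } } },
  ];
}

// Hypothetical helper: for each group, keep the oldest entry and
// return the _id values of all the others.
// Assumes createdAt is a Date or a number.
function idsToDelete(groups) {
  const ids = [];
  for (const group of groups) {
    // Copy before sorting so the input is not mutated; oldest first
    const sorted = [...group.docs].sort((a, b) => a.createdAt - b.createdAt);
    for (const doc of sorted.slice(1)) {
      ids.push(doc._id);
    }
  }
  return ids;
}

// Possible usage with the Node.js driver (not executed here):
//   const groups = await db.collection(collectionName)
//     .aggregate(buildDuplicateKeyPipeline(), { maxTimeMS: 10000 })
//     .toArray();
//   const ids = idsToDelete(groups);
//   if (ids.length > 0) {
//     await db.collection(collectionName).deleteMany({ _id: { $in: ids } });
//   }
```

Note that the pipeline in the question sorts and skips across all groups at once, so $skip: 1 drops a single document overall rather than one per group. Selecting the entry to keep per group, as above, avoids that. It is worth running the aggregation alone first and inspecting the ids before calling deleteMany.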