I was working on an aggregation pipeline to clean up duplicate data in a collection, but it kept throwing the following error even after I set { allowDiskUse: true }.
Error: "MongoServerError: PlanExecutor error during aggregation :: caused by :: Exceeded memory limit for $group, but didn't allow external spilling; pass allowDiskUse:true to opt in"
The objective was to delete all duplicates (same phoneNumber and same startTime) while retaining the first entry in the database.
Below is my pipeline:
let pipeline = [
  {
    $group: {
      _id: {
        phoneNumber: "$details.phoneNumber",
        startTime: "$details.startTime",
      },
      docs: { $push: "$$ROOT" },
      count: { $sum: 1 },
    },
  },
  {
    $match: {
      count: { $gt: 1 },
    },
  },
  {
    $unwind: "$docs",
  },
  {
    $sort: {
      "docs.createdAt": 1,
    },
  },
  {
    $skip: 1,
  },
  {
    $replaceRoot: { newRoot: "$docs" },
  },
]
When calling aggregate, I passed allowDiskUse as described in the driver's types, but it still throws the same error.
This is how I set the option:
const options = {
  allowDiskUse: true,
  maxTimeMS: 10000,
}
Then, using the MongoDB Node.js driver v5.8, I ran the aggregation like this:
db.collection(collectionName).aggregate(pipeline, options)
I added allowDiskUse as the error message suggested, but it still doesn't work. Even MongoDB Compass throws the same error.
So, what's the issue here?
And if I could achieve my objective some other way, please comment.
Thanks
Please make sure that you are not running on a MongoDB Atlas shared cluster (M0, M2, M5). On these tiers, allowDiskUse cannot be enabled.
See the MongoDB Atlas documentation on the limitations of shared clusters.
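As for doing it another way: pushing "$$ROOT" makes every group hold whole documents, which is usually what pushes $group over its memory limit. A possible alternative, sketched below under a few assumptions, is to collect only _id and createdAt per group and pick the ids to delete in application code. The field names come from the question; the helper names (buildDuplicatePipeline, idsToDelete, removeDuplicates) are made up for illustration, and `collection` is assumed to be a Collection from the Node.js driver. It also keeps the earliest entry per group, whereas a single $skip: 1 after $unwind drops only one document from the whole result. On a very large collection this may still hit the limit, so treat it as a starting point rather than a guaranteed fix.

```javascript
// Sketch only: field names follow the question; helper names are hypothetical.

// Collect just _id and createdAt per group instead of whole documents ($$ROOT).
function buildDuplicatePipeline() {
  return [
    {
      $group: {
        _id: {
          phoneNumber: "$details.phoneNumber",
          startTime: "$details.startTime",
        },
        entries: { $push: { id: "$_id", createdAt: "$createdAt" } },
        count: { $sum: 1 },
      },
    },
    // Keep only keys that occur more than once.
    { $match: { count: { $gt: 1 } } },
  ];
}

// For each group, keep the earliest entry and return the ids of the rest.
function idsToDelete(groups) {
  const ids = [];
  for (const group of groups) {
    const sorted = [...group.entries].sort(
      (a, b) => new Date(a.createdAt) - new Date(b.createdAt)
    );
    for (const entry of sorted.slice(1)) ids.push(entry.id);
  }
  return ids;
}

// Assumes `collection` is a Collection from the MongoDB Node.js driver.
async function removeDuplicates(collection) {
  const groups = await collection
    .aggregate(buildDuplicatePipeline(), { allowDiskUse: true })
    .toArray();
  const ids = idsToDelete(groups);
  if (ids.length === 0) return 0;
  const result = await collection.deleteMany({ _id: { $in: ids } });
  return result.deletedCount;
}
```

Calling `await removeDuplicates(db.collection(collectionName))` would then return the number of deleted documents. It is worth running only the aggregation first and inspecting the ids before deleting anything.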