I trying to query mongo collection with unique column and pagination using mongoose.
I have pagination data that I need to know the total unique counts, offset/page number but the query is needs to be executed without $skip
and $limit
, which returns entire collection and eating up memory just to get the count. I tried using distinct
await Token.find().distinct('token_id').countDocuments().exec()
which is not giving the correct unique count.
Its been a while working on mongo, from what I learnt that aggregates collects all the records, then filters/group and sends the results back. Which is fine for me, as long as the node app is not doing any memory intensive operations.
So, how do I achieve the total unique records (without fetching the collection if possible just the count) and pagination to work in right way.
Below is the mongoose model Token
and grouping by token_id
to fetch unique records. (its expected that there might be duplicates token_id
)
const getUniqueTokens = async () => {
return Token.aggregate([{
$group: {
_id: `$token_id`,
// Get the first document for each unique field
doc: {
$first: "$$ROOT"
}
}
},
{
$replaceRoot: {
// Replace root to get back the original document structure
newRoot: "$doc"
}
},
{
$skip: offset,
},
{
$limit: 100
}
]).exec();
};
Your aggregation pipeline is mostly fine - you should add a $sort
stage so that you're reliably paginating 1, 2, 3, 4 always. Instead of 3, 2, 1, 4 sometimes and 2, 4, 1, 3 other times. (Btw, your aggregation pipeline has backticks around `$token_id` - it should be single or double quotes as I've done below.)
Get the total unique records by putting a $count
stage after the group and removing everything else. Execute that as a separate aggregation query.
Aggregation pipeline and Mongo Playground for pagination:
db.token.aggregate([
{
$group: {
_id: "$token_id",
// Get the first document for each unique field
doc: { $first: "$$ROOT" }
}
},
{
$replaceRoot: {
// Replace root to get back the original document structure
newRoot: "$doc"
}
},
{ $sort: { token_id: 1 } }, // ADDED THIS STAGE
{ $skip: offset },
{ $limit: 100 }
])
Aggregation pipeline and Mongo Playground for the total:
db.token.aggregate([
{
$group: {
_id: "$token_id",
// Get the first document for each unique field
doc: {
$first: "$$ROOT"
}
}
},
{ $count: "total" }
])
You could do both in one aggregation using a $unionWith
but I don't recommend that, since getting the result and then extracting the "total" item separately from the actual documents is an unnecessary complication.