I am new to MongoDb. I need help to fetch the last n month record, there might be multiple entry per month but the query needs to return only the last entry per month.
For e.g lets say if n is 3
and userId is userId1
(that means return last 3 month record for userId1
).
Sample inputs in the collection:
[
{
"_id": objectId("aaaaaa"),
"userId": "userId1",
"processedAt": "2021-06-01T12:16:49.349Z"
},
{
"_id": objectId("bbbbb"),
"userId": "userId1",
"processedAt": "2021-10-11T12:16:49.349Z"
},
{
"_id": objectId("ccccc"),
"userId": "userId1",
"processedAt": "2021-10-25T12:16:49.349Z"
},
{
"_id": objectId("eeeee"),
"userId": "userId1",
"processedAt": "2021-09-12T12:16:49.349Z"
},
{
"_id": objectId("fffff"),
"userId": "userId1",
"processedAt": "2021-09-28T12:16:49.349Z"
},
{
"_id": objectId("ggggg"),
"userId": "userId1",
"processedAt": "2021-09-23T12:16:49.349Z"
},
{
"_id": objectId("hhhhh"),
"userId": "userId1",
"processedAt": "2021-07-23T12:16:49.349Z"
},
{
"_id": objectId("iiiii"),
"userId": "userId2",
"processedAt": "2021-09-29T12:16:49.349Z"
},
{
"_id": objectId("jjjjj"),
"userId": "userId1",
"processedAt": "2022-01-29T12:16:49.349Z"
},
{
"_id": objectId("kkkkk"),
"userId": "userId1",
"processedAt": "2022-02-29T12:16:49.349Z"
},
]
Expected Result: Should return by userId, limit n months(fetch only the last saved entry of the month) and the ascending order of the month of processedAt:
[{
"_id": objectId("ccccc"),
"userId": "userId1",
"processedAt": "2021-10-25T12:16:49.349Z"
},
{
"_id": objectId("jjjjj"),
"userId": "userId1",
"processedAt": "2022-01-29T12:16:49.349Z"
},
{
"_id": objectId("kkkkk"),
"userId": "userId1",
"processedAt": "2022-02-29T12:16:49.349Z"
}
]
I have tried below query however which is returning all the records. I want query needs to consider only the last entry per month. I have been using mongojs driver v4.1.2
db.collection(collection_name)
.find({ userId: userId }, { projection: { _id: 0 } })
.sort({ processedAt: -1 })
.limit(n)
.toArray()
You can use $setWindowFields to aggregate a "rank" for the "partition" / "group" (i.e. the month in your example) and only choose the document with top rank.
The ranking can be defined as processedAt: -1
as you want to keep only the latest record in the month with highest rank.
{
"$setWindowFields": {
"partitionBy": {
"$dateToString": {
"date": "$processedAt",
"format": "%Y-%m"
}
},
"sortBy": {
"processedAt": -1
},
"output": {
"rank": {
$rank: {}
}
}
}
}
Here is the Mongo playground for your reference.
As the sample dataset is using ISODate format, it is possible to sort and group the field by leftmost 7 characters (i.e. yyyy-MM). Keeping only the first document inside the month group should do the tricks.
{
$sort: {
processedAt: -1
}
},
{
"$addFields": {
"month": {
"$substrCP": [
"$processedAt",
0,
7
]
}
}
},
{
$group: {
_id: "$month",
last: {
$first: "$$ROOT"
}
}
}
Here is the Mongo playground.