I have 2 collections users
and files
. A user document looks like this :
{
"_id": "user1",
"name": "John"
}
A file
document looks like this :
{
"_id": "fileID1",
"url": "any_url",
"userID": "user1"
}
I have a list of user IDs let's say : ["user1", "user2", "user5"]
and i want to perform a single query that takes this list
as input and returns 5 files of each user ID.
For now i have a naive logic that is not really performant :
result_users = users_collection.find({}).limit(15)
for user in result_users:
user_id.append(user["_id"])
result_files = files_collection.find({"userID": user["_id"]}).limit(5)
for f in result_files:
file_dict[user["_id"]].append(f)
The query i am looking for would naively be something like :
result_files = files_collection.find({"userID": {"$in / $each": user_ids_list}).limit(5)
The output i am looking for is :
{
"user1": ["fileID1", "fileID2", "fileID8", "fileID3"],
"user2": ["fileID4", "fileID5", "fileID6", "fileID9", "fileID7"],
...
}
With at max 5 files per userID.
I am pretty sure this is achievable with aggregation
but i have no clue how to write it correctly.
Note : i don't really care about the output format, i only need a list of file that is sorted correctly with 5 files per userID.
I can map the output myself when i am confident the query returns the right files
.
Thanks !
Edit : it would be amazing to add a sort("at", DESCENDING) on the files collection aggregation
since you already have the user names in the file docs you won't need to query user collection.
$match
based on existence of user in input arrayuserID
field and push the file id and the timestamp (will need later for sorting)$sortArray
on the grouped file array (v). Now it is sorted in descending(-1). On that sorted call $slice
and finally map over the sliced array to get an array of just the ids.
this will now generate in the format [ { _id:"user1",v: [...]}, { _id:"user2",v: [...]}... ]
db.files.aggregate([
{
$match: { userID: { $in: ["user1", "user2", "user5"] } }
},
{
$group: {
_id: "$userID",
v: { $push: { id: "$_id", at: "$at" } }
}
},
{
$project: {
v: {
$map: {
input: { $slice: [{ $sortArray: { input: "$v", sortBy: { at: -1 } } }, 5] },
in: "$$this.id"
}
}
}
}
])
if you want to further convert it to { "user1": [...], "user2": [...], ... }
format you can add these 2 stages
{
$group: {
_id: null,
kv: { $push: { k: "$_id", v: "$v" } }
}
},
{
$replaceRoot: { newRoot: { $arrayToObject: "$kv" } }
}