I have the following Mongo collection:
[
{
"key": 1,
"user": "A",
"comment": "commentA1"
},
{
"key": 2,
"user": "A",
"comment": "commentA2"
},
{
"key": 5,
"user": "A",
"comment": "commentA5"
},
{
"key": 2,
"user": "B",
"comment": "commentB2"
},
{
"key": 3,
"user": "B",
"comment": "commentB3"
},
{
"key": 6,
"user": "B",
"comment": "commentB6"
}
]
and I need to find the first continuous keys, with no gaps, per user. So, for user A I should get the first 2 documents, and for user B the first two also. The collection might contain more than 2M documents, so the query should work fast.
I have found SQL solutions for this problem (http://www.silota.com/docs/recipes/sql-gap-analysis-missing-values-sequence.html in section number 3), but I am looking for a Mongo solution.
How can I do it in Mongo 4.0 (DocumentDB) ?
EDIT: according to further elaboration on the comments,
One option is:
db.collection.aggregate([
{$sort: {key: 1}},
{$group: {
_id: "$user",
data: {$push: {key: "$key", comment: "$comment"}},
shadow: {$push: {$add: ["$key", 1]}}
}},
{$project: {
data: 1,
shadow: {$filter: {input: "$shadow", cond: {$in: ["$$this", "$data.key"]}}}
}},
{$project: {data: 1, shadow: 1, firstItem: {$subtract: [{$first: "$shadow"}, 1]}}},
{$project: {data: 1, firstItem: 1, shadow: {$concatArrays: [["$firstItem"], "$shadow"]}}},
{$project: {
data: 1,
shadow: {$reduce: {
input: {$range: [0, {$size: "$shadow"}]},
initialValue: [],
in: {
$concatArrays: [
"$$value",
{$cond: [
{$eq: [
{$arrayElemAt: ["$shadow", "$$this"]},
{$add: ["$$this", "$firstItem"]}
]},
[{$arrayElemAt: ["$shadow", "$$this"]}],
[]
]},
]
}
}
}
}
},
{$project: {data: {$filter: {input: "$data", cond: {$in: ["$$this.key", "$shadow"]}}}}},
{$unwind: "$data"},
{$project: {comment: "$data.comment", key: "$data.key"}}
])
See how it works on the playground example