mongodbmongodb-queryaggregation-frameworkmongodb-indexes

Is there a way to filter array of subdocuments in the $match stage?


db.col.aggregate([
  {
    $match: {
      field_nest: { $elemMatch: { /* conditions */ } }
    }
  }
])

This is my current set up. In addition to matching the parent document, it needs to also return only the subdocument that matches the $elemMatch.

Otherwise, I would have to $unwind and $match again. But this would no long be able to use the index. The idea is to be able to use the indexes.


Solution

  • No the $match stage selects documents to pass along the pipeline, it does not modify the documents being passed along.

    You can use $elemMatch in the $match stage to select the documents, and then use $filter in an $addFields stage to filter out the non-matching elements.

    Perhaps something like:

    db.col.aggregate([
        {$match: {
           field_nest: { $elemMatch: { /* conditions */ } }
        }},
        {$addFields: {
           field_nest: {$filter:{
              input: "$field_nest",
              as: "item",
              cond: { /* conditions */ }
           }}
        }}
    ])
    

    This may be able to use an index, depending on the exact conditions and available indexes.

    For example, if the query were

    db.col.aggregate([
      {$match: {
        field_nest:{$elemMatch:{a:1,b:2}}
      }}
    ])
    

    It could use an index on {"field_nest.a":1,"field_nest.b":1}, but it could not use an index on {field_nest:1} or {"field_next.c":1, "field_next.a":1}.

    If the query were

    db.col.aggregate([
      {$match: {
        top_field: "name",
        some_value: {$gte: "never"},
        field_nest:{$elemMatch:{a:1,b:2}}
      }}
    ])
    

    the query executor would look at all of the available indexes, but may use an index that does not include the array field.

    If the query were

      {$match: {
        top_field: "name",
        some_value: {$gte: "never"},
        field_nest:{$elemMatch:{a:{$regex:"unanchored"},b:2}}
      }}
    ])
    

    it would not be able to use an index for selecting field_nest.a, but might be able to use one for field_nest.b.

    The determination of whether or not an index will be used depends greatly on the exact nature of the match conditions and available indexes.