node.jsmongodbmongodb-queryaggregation-frameworkmongodb-indexes

mongodb not using index unless field is specified in $match


compound index

{
  A: 1,
  B: 1
}

In this query, the compound index is not used:

db.user.aggregate([
  { $match: {
      B: { $gt: 100 }
  } },
  {
    $sort: {
      A: 1
    }
  },
  {
    $limit: 1000
  }
])

I need do this in order to get it to use the index:

db.user.aggregate([
  { $match: {
      A: { $exists: true }
      B: { $gt: 100 }
  } },
  {
    $sort: {
      A: 1
    }
  },
  {
    $limit: 1000
  }
])

Adding A: { $exists: true } is unnecessary because all documents will have this field. I thought that mongodb was smart enough to use an index to sort?


Solution

  • The claim as written seems incorrect or incomplete. When I attempt to reproduce this on version 6.0.1 I see the index being used* as opposed to a collection scan:

    > db.version()
    6.0.1
    > db.user.createIndex({
    ...           "A": 1,
    ...           "B": 1
    ...         })
    A_1_B_1
    > db.user.aggregate([
    ...   {
    .....     $match: {
    .......       B: {
    .........         $gt: 100
    .........       }
    .......     }
    .....   },
    ...   {
    .....     $sort: {
    .......       A: 1
    .......     }
    .....   },
    ...   {
    .....     $limit: 1000
    .....   }
    ... ]).explain().queryPlanner.winningPlan
    {
      stage: 'LIMIT',
      limitAmount: 1000,
      inputStage: {
        stage: 'FETCH',
        filter: { B: { '$gt': 100 } },
        inputStage: {
          stage: 'IXSCAN',
          keyPattern: { A: 1, B: 1 },
          indexName: 'A_1_B_1',
          isMultiKey: false,
          multiKeyPaths: { A: [], B: [] },
          isUnique: false,
          isSparse: false,
          isPartial: false,
          indexVersion: 2,
          direction: 'forward',
          indexBounds: { A: [ '[MinKey, MaxKey]' ], B: [ '[MinKey, MaxKey]' ] }
        }
      }
    }
    

    We can similarly see the index used in this mongoplayground demonstration.

    So more specific details about your situation would be required in order to advise further on the question as written. Is collation involved perhaps? But the general answer to your question is that MongoDB can (and does) use the index in the situation that was described which aligns with the documentation referenced in the other answerv

    *Notably the index is not used as efficiently as it could be, and so it is probably separately a good idea to include the $exists clause to improve performance anyway. It just shouldn't strictly be required in order to get the index to be used in the first place.


    Edit

    Based on your comment, it sounds like the corrected version of your problem statement and question is:

    MongoDB uses the index with the aggregation as written, but it does not use it as efficiently as compared to when the A: { $exists: true } clause is added. Why is that?

    This is related to the note at the end of my original answer. It appears to be a deficiency in the current querying system, specifically the one tracked here. Adding the $exists clause (or one logically equivalent) to the preceding field in the index is effectively the workaround.