mongodbmongodb-queryaggregation-frameworkmongodb-indexes

Is there a way to specify a bin width for multi column sorting in MongoDB?


This is my sort object:

{
  income: 1,
  age: 1
}

The problem is that income in theory is a real number, and hence unique. So this may render age to have no effect, unless income can be binned.

How to specify this in MongoDB aggregation? And would this binning operation, if possible, be too expensive?


Solution

  • This can be achieved fairly easily with an aggregation stage:

    db.collection.aggregate([
      {$set: { roundedIncome: { $floor: "$income"}}},
      {$sort: {"roundedIncome": 1, "age": 1}}
    ])
    

    You can of course change $floor to any expression you want for the resolution you want (e.g., 1dp, etc).

    As you point out - there are performance considerations to this - the biggest one is if you hoped to use an index for the sort on income - this now won't be the case, which will cause an in-memory sort (with the limits that entails).

    The cost otherwise will be equivalent to 1 Math.floor operation per document looked at. This could end up being the entire collection - since you can't skip/limit until you've sorted and you can't sort until you've ran the $set.

    You could negate the cost to a certain extent by running the sort on income first, then the skip/limit, then the $set + $sort at the end - your results would be slightly inaccurate - but possibly acceptable