mongodb aggregation-framework faceted-search

Perform a search with facets unknown upfront in MongoDB Atlas


I have the following document structure in MongoDB:

{
  // other keys,
  tags: [
    tagA: "red",
    tagB: "green"
  ]
},
{
  // other keys,
  tags: [
    tagA: "orange",
    tagB: "green",
    tagC: "car"
  ]
}

I want to perform a $facet search that gives me the following output (the name of each tag, the values that occur for that tag, and the count of each value):

{
  [
    tagA: {
      red: 1,
      orange: 1
    },
    tagB: {
      green: 2
    },
    tagC: {
      car: 1
    }
  ]   
}

The tricky part is that the facets are unknown upfront (they can vary), and every tutorial I found only works for a predefined set of facets.

Is it possible?

P.S.: How can I get this output returned alongside the results of a given query? The response would look something like:

{
  queryResults: [all the results, as in a normal query],
  facets: [result shown in the accepted answer]
}

Solution

  • Assuming this as input (I've added braces around each object in your tags array):

    [
      {
        tags: [
          {
            tagA: "red"
          },
          {
            tagB: "green"
          }
        ]
      },
      {
        tags: [
          {
            tagA: "orange"
          },
          {
            tagB: "green"
          },
          {
            tagC: "car"
          }
        ]
      }
    ]
    

    You could then run an aggregation pipeline as follows:

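    db.collection.aggregate([
      // 1. One document per element of the tags array
      {
        "$unwind": "$tags"
      },
      // 2. Turn each tag object into an array of { k, v } pairs
      {
        "$addFields": {
          "kv": {
            "$objectToArray": "$tags"
          }
        }
      },
      // 3. One document per { k, v } pair
      {
        "$unwind": "$kv"
      },
      // 4. Count each unique (key, value) combination
      {
        "$group": {
          "_id": {
            "key": "$kv.k",
            "value": "$kv.v"
          },
          "count": {
            "$sum": 1
          }
        }
      },
      // 5. Regroup by key, collecting { k: value, v: count } pairs
      {
        "$group": {
          "_id": "$_id.key",
          "value": {
            "$push": {
              "k": "$_id.value",
              "v": "$count"
            }
          }
        }
      },
      // 6. Build { value: count } and wrap it as a single { k: tagName, v: ... } pair
      {
        "$project": {
          "val": [
            {
              "k": "$_id",
              "v": {
                "$arrayToObject": "$value"
              }
            }
          ]
        }
      },
      // 7. Turn that pair into { tagName: { value: count } }
      {
        "$project": {
          "res": {
            "$arrayToObject": "$val"
          }
        }
      },
      // 8. Promote the result to the top level
      {
        "$replaceRoot": {
          "newRoot": "$res"
        }
      }
    ])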
    

    It gives you this result:

    [
      {
        "tagA": {
          "orange": 1,
          "red": 1
        }
      },
      {
        "tagB": {
          "green": 2
        }
      },
      {
        "tagC": {
          "car": 1
        }
      }
    ]
    

    You can try this on Mongo Playground: https://mongoplayground.net/p/FZbM-BGJRBm. Hope this answers your question.

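    Detailed explanation:

    1. $unwind on the tags field produces one document per object in the tags array.
    2. $objectToArray turns each tag object into an array of { k, v } pairs, so the keys (tagA, tagB, ...) become values that can be grouped on.
    3. A second $unwind produces one document per { k, v } pair.
    4. $group with the $sum accumulator counts the occurrences of each unique key/value combination.
    5. $group by key (tagA, tagB, etc.) with the $push accumulator collects each value and its count into an array of { k, v } pairs (useful in the next step).
    6. $arrayToObject converts that array into a { value: count } object, which is wrapped in a one-element array of { k, v } keyed by the tag name.
    7. A second $arrayToObject converts that one-element array into an object keyed by the tag name.
    8. $replaceRoot promotes that object to the top level for a cleaner result.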
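
    To make the steps above easier to follow, here is a rough in-memory sketch of the same logic in plain JavaScript. It is not MongoDB code, and countFacets is a hypothetical helper name used only for illustration; it assumes every tag value is a string, as in the sample data.

    ```javascript
    // In-memory mirror of the pipeline's logic (no MongoDB required).
    // `countFacets` is a hypothetical name, not a MongoDB API.
    function countFacets(docs) {
      const facets = {};
      for (const doc of docs) {
        // $unwind "$tags": visit each object in the tags array
        for (const tag of doc.tags || []) {
          // $objectToArray + $unwind: visit each key/value pair of that object
          for (const [key, value] of Object.entries(tag)) {
            // the two $group stages: count each value, nested under its key
            if (!facets[key]) facets[key] = {};
            facets[key][value] = (facets[key][value] || 0) + 1;
          }
        }
      }
      // $project + $replaceRoot: one single-key object per tag name
      return Object.entries(facets).map(([key, counts]) => ({ [key]: counts }));
    }

    const docs = [
      { tags: [{ tagA: "red" }, { tagB: "green" }] },
      { tags: [{ tagA: "orange" }, { tagB: "green" }, { tagC: "car" }] },
    ];

    console.log(JSON.stringify(countFacets(docs)));
    // [{"tagA":{"red":1,"orange":1}},{"tagB":{"green":2}},{"tagC":{"car":1}}]
    ```

    Unlike the aggregation, this sketch returns the tags in first-seen order; the $group stages in MongoDB do not guarantee any output order.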

    If you want to understand each step better, consider reading the MongoDB documentation for each pipeline stage I used. You can also open the Mongo Playground link above and delete stages to see the output after each step.
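
    Regarding the P.S. in the question: one possible approach, which I have not tested against your data, is to put your filter in a $match stage and then use $facet to run two sub-pipelines over the matched documents, one passing them through and one running the stages above. The query filter below is only a placeholder assumption; substitute your own.

    ```javascript
    // Placeholder filter (an assumption for illustration, not from the question).
    const query = { "tags.tagB": "green" };

    // The same eight stages as in the pipeline above, written compactly.
    const facetStages = [
      { $unwind: "$tags" },
      { $addFields: { kv: { $objectToArray: "$tags" } } },
      { $unwind: "$kv" },
      { $group: { _id: { key: "$kv.k", value: "$kv.v" }, count: { $sum: 1 } } },
      { $group: { _id: "$_id.key", value: { $push: { k: "$_id.value", v: "$count" } } } },
      { $project: { val: [{ k: "$_id", v: { $arrayToObject: "$value" } }] } },
      { $project: { res: { $arrayToObject: "$val" } } },
      { $replaceRoot: { newRoot: "$res" } },
    ];

    const pipeline = [
      { $match: query },
      {
        $facet: {
          // pass the matched documents through unchanged
          queryResults: [{ $match: {} }],
          // tag counts computed over the same matched documents
          facets: facetStages,
        },
      },
    ];

    // In mongosh: db.collection.aggregate(pipeline)
    ```

    This should return a single document with a queryResults array and a facets array (one single-key object per tag). Keep in mind that the $facet output is one document and is therefore subject to the 16 MB BSON document size limit, so you may want to add $skip/$limit stages to the queryResults sub-pipeline for large result sets.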