mongodb, mongoose, mongodb-query, mongodb-atlas, mongodb-atlas-search

Get a list of unique values from MongoDB Atlas Search before match filters are applied


I use MongoDB Atlas Search to search through a list of resources in my database (I'm using Mongoose, hence the slightly different syntax):

const allApprovedSearchResults = await Resource.aggregate([{
    $search: {
        compound: {
            should: [
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["title", "link", "creatorName"],
                        allowAnalyzedField: true,
                    }
                },
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["topics"],
                        allowAnalyzedField: true,
                        score: { boost: { value: 2 } },
                    }
                },
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["description"],
                        allowAnalyzedField: true,
                        score: { boost: { value: 0.2 } },
                    }
                }
            ]
        }
    }
}])
    .match(matchFilter)
    .exec();

const uniqueLanguagesInSearchResults = [...new Set(allApprovedSearchResults.map(resource => resource.language))];

The last line retrieves all unique languages in the result set. However, I want a list of all the languages before .match(matchFilter) is applied. Is there a way to do this without running a second, unfiltered search?


Solution

  • You can use a $facet after the $search:

    .aggregate([
      {
        $search: {
            compound: {
                should: [
                    {
                        wildcard: {
                            query: queryStringSegmented,
                            path: ["title", "link", "creatorName"],
                            allowAnalyzedField: true,
                        }
                    },
                    {
                        wildcard: {
                            query: queryStringSegmented,
                            path: ["topics"],
                            allowAnalyzedField: true,
                            score: { boost: { value: 2 } },
                        }
                    },
                    {
                        wildcard: {
                            query: queryStringSegmented,
                            path: ["description"],
                            allowAnalyzedField: true,
                            score: { boost: { value: 0.2 } },
                        }
                    }
                ]
            }
        }
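      },
      {
        $facet: {
          // Branch 1: only the documents that pass the match filter
          filter: [
            { $match: matchFilter }
          ],
          // Branch 2: every distinct language found by $search, ignoring the filter
          // (the key must not contain trailing whitespace)
          allLanguages: [
            { $group: { _id: 0, all: { $addToSet: "$language" } } } // <- replace "$language" with the real field name
          ]
        }
      }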
    ])
    

    You did not provide a document structure, so I'm assuming 'language' is the field name. The $facet stage forks the pipeline: the branch called 'filter' contains only the documents that pass matchFilter, while the branch called 'allLanguages' contains the set of all languages returned by $search, regardless of the filter. You can add $project stages inside each $facet sub-pipeline to format the data.

    According to the docs, it should work :)
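
With $facet, the aggregation should return an array holding a single document whose keys are the facet names, rather than a flat list of resources, so the calling code needs to unpack it. A minimal sketch of that step, assuming the facet keys are exactly `filter` and `allLanguages` (no trailing whitespace); `unpackFacetResult` and the sample data are hypothetical and only illustrate the expected shape:

```javascript
// Hypothetical helper: unpack the single document produced by the $facet stage.
function unpackFacetResult(aggregateResult) {
  // $facet should emit one document; fall back to an empty object just in case.
  const [facetDoc = {}] = aggregateResult;

  // Documents that passed matchFilter.
  const filtered = facetDoc.filter ?? [];

  // The $group branch yields at most one document; it is missing
  // when $search matched nothing, so default to an empty list.
  const languages = facetDoc.allLanguages?.[0]?.all ?? [];

  return { filtered, languages };
}

// Mocked aggregation output with made-up sample data.
const sampleResult = [{
  filter: [{ title: "Intro to Go", language: "en" }],
  allLanguages: [{ _id: 0, all: ["en", "de", "fr"] }],
}];

const { filtered, languages } = unpackFacetResult(sampleResult);
console.log(filtered.length, languages.slice().sort());
```

In the real code you would pass the awaited result of the aggregation to the helper instead of `sampleResult`. Note that $addToSet does not guarantee any ordering, so sort the languages yourself if the order matters.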