mongodb aggregation-framework full-text-search atlas

MongoDB - $search throws "an unknown error occurred" in MongoDB Atlas


I am using MongoDB Atlas to run the $search aggregation pipeline stage. I know that $search is only available for collections hosted on a MongoDB Atlas cluster, and I have the proper subscription, so that's not the problem. I am trying to write a $search stage with the autocomplete operator: when I type a word, I want all matching values returned as results.

[
    {
        '$search': {
            'index': 'Custom_Test', 
            'autocomplete': {
                'path': 'SomeFileName', 
                'query': 'Test', 
                'tokenOrder': 'sequential'
            }, 
            'highlight': {
                'path': [
                    'SomeFileName'
                ]
            }
        }
    }, {
        '$limit': 10
    }, {
        '$project': {
            '_id': 1, 
            'SomeFileName': '$SomeFileName', 
            'Ancestors': 1, 
            'highlights': {
                '$meta': 'searchHighlights'
            }
        }
    }
]

So when I pass Test, it should return all documents with a matching SomeFileName field. I also tried removing index, since the default index is used when none is specified, but that didn't help. Any guidance is appreciated.


Solution

  • I know that the $search aggregation pipeline stage is only available for collections hosted on MongoDB Atlas.

    This is not the whole picture. Atlas Search is a separate feature, largely "unrelated" to the base MongoDB product. It is built on the Lucene engine, like many modern text search engines.

    This means it actually maintains a separate inverted index to enable such text search capabilities. It also means the data needs to be prepared and tokenized accordingly; there is no magic search layer that can achieve such performance without preprocessing.
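
To make the preprocessing idea concrete, here is a minimal, self-contained sketch of an inverted index built from edge n-grams (word prefixes). It only illustrates the general technique and is not how Atlas Search or Lucene is implemented; the function names, the sample documents, and the 2/15 gram limits are all assumptions made for the example.

```python
from collections import defaultdict


def edge_ngrams(text, min_grams=2, max_grams=15):
    """Yield lowercase prefixes (edge n-grams) of each whitespace-separated word."""
    for word in text.lower().split():
        for n in range(min_grams, min(max_grams, len(word)) + 1):
            yield word[:n]


def build_inverted_index(docs):
    """Map every prefix token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in edge_ngrams(text):
            index[token].add(doc_id)
    return index


def autocomplete(index, query):
    """Look up a single-word query; no scan over the documents is needed."""
    return sorted(index.get(query.lower(), set()))


# Hypothetical sample data, keyed by document id.
docs = {1: "Test report", 2: "Testing notes", 3: "Final summary"}
index = build_inverted_index(docs)
print(autocomplete(index, "Test"))  # [1, 2]
```

The lookup is a plain dictionary access because all the work happened while building the index, which is why the data has to be indexed before any search can return results.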

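    You need to create a search index for the collection, define the proper field mappings and tokenization, and let Atlas index the data before it can be searched. For the autocomplete operator, the queried field has to be mapped with the autocomplete type. This is actually quite simple; just follow the Atlas Search docs on how to do this.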
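
As a rough sketch of such an index definition, the mapping below indexes the SomeFileName field from the question with the autocomplete type. The tokenization, minGrams and maxGrams values are illustrative assumptions, not the asker's actual settings, and the helper function is hypothetical; it assumes PyMongo 4.5 or later (which provides create_search_index) and a collection on an Atlas cluster.

```python
# Index name taken from the question's pipeline.
INDEX_NAME = "Custom_Test"

# Illustrative definition: only SomeFileName is indexed, as an autocomplete field.
index_definition = {
    "mappings": {
        "dynamic": False,
        "fields": {
            "SomeFileName": {
                "type": "autocomplete",
                "tokenization": "edgeGram",  # assumed; prefix matching
                "minGrams": 2,               # assumed value
                "maxGrams": 15,              # assumed value
            }
        },
    }
}


def create_autocomplete_index(collection):
    """Hypothetical helper: create the search index on a PyMongo collection.

    Assumes PyMongo 4.5+ and a collection hosted on Atlas.
    """
    return collection.create_search_index(
        {"name": INDEX_NAME, "definition": index_definition}
    )
```

The same definition can be pasted into the JSON editor in the Atlas UI instead. Either way, the index has to finish building before the $search stage can use it.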

    Final disclaimer: this answer has some inaccuracies, as I decided to simplify the explanation.