I have a test collection with these two documents:
{ _id: ObjectId("636ce11889a00c51cac27779"), sku: 'kw-lids-0009' }
{ _id: ObjectId("636ce14b89a00c51cac2777a"), sku: 'kw-fs66-gre' }
I've created a search index with this definition:
{
  "analyzer": "lucene.standard",
  "searchAnalyzer": "lucene.standard",
  "mappings": {
    "dynamic": false,
    "fields": {
      "sku": {
        "type": "string"
      }
    }
  }
}
If I run this aggregation:
[{
  $search: {
    index: 'test',
    text: {
      query: 'kw-fs',
      path: 'sku'
    }
  }
}]
Why do I get 2 results? I only expected the one with sku: 'kw-fs66-gre'
😬
During indexing, the standard analyzer breaks the string "kw-lids-0009" into 3 tokens, [kw][lids][0009], and similarly tokenizes "kw-fs66-gre" as [kw][fs66][gre]. When you query for "kw-fs", the same analyzer tokenizes the query as [kw][fs]. The text operator matches a document when any of the query tokens is found, so Lucene matches both documents, as both have the [kw] token in the index.
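To make the matching concrete, here is a rough sketch in plain JavaScript. The `tokenize` and `matches` helpers are hypothetical stand-ins, not Atlas Search APIs: `tokenize` only approximates lucene.standard by lowercasing and splitting on non-alphanumeric characters, which happens to give the same tokens for these SKUs.

```javascript
// Hypothetical approximation of lucene.standard for these SKUs:
// lowercase the text and split on anything that is not a letter or digit.
// The real analyzer follows Unicode word-break rules and is more involved.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Mimics how the text operator treats a multi-token query:
// a document matches if ANY query token is among its indexed tokens.
function matches(query, value) {
  const indexed = new Set(tokenize(value));
  return tokenize(query).some((token) => indexed.has(token));
}

const skus = ['kw-lids-0009', 'kw-fs66-gre'];
const hits = skus.filter((sku) => matches('kw-fs', sku));

console.log(tokenize('kw-fs')); // query tokens
console.log(hits);              // SKUs sharing at least one token with the query
```

Both SKUs end up in `hits` because each contains the token `kw`, even though only one of them contains anything starting with `fs`.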
To get the behavior you're looking for, you should index the sku field as type autocomplete and use the autocomplete operator in your $search stage instead of text.
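As a starting point, the index definition and pipeline might look something like the sketch below. The variable names `indexDefinition` and `pipeline` are just for illustration, and the `tokenization`, `minGrams`, and `maxGrams` values are assumptions you would want to tune and verify against your own data, not settings taken from the question.

```javascript
// Sketch of an index definition that maps sku as an autocomplete field.
// tokenization / minGrams / maxGrams are illustrative values, not recommendations.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      sku: {
        type: 'autocomplete',
        tokenization: 'edgeGram',
        minGrams: 2,
        maxGrams: 15
      }
    }
  }
};

// Same query as before, but using the autocomplete operator instead of text.
const pipeline = [{
  $search: {
    index: 'test',
    autocomplete: {
      query: 'kw-fs',
      path: 'sku'
    }
  }
}];
```

You would save the definition as the `test` index and pass the pipeline to `aggregate()` on the collection, then check that only the 'kw-fs66-gre' document comes back.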