Let's say I have a mongo collection with a text index
on the itemName
field with these 3 documents:
{
_id: ...,
itemName: 'Mashed carrots with big carrot pieces',
price: 1.29
},
{
_id: ...,
itemName: 'Carrot juice',
price: 0.79
},
{
_id: ...,
itemName: 'Apple juice',
price: 1.49
}
I then exec a query like so:
db.items.find({ $text: { $search: 'Car' } }, { score: { $meta: "textScore" } }).sort( { score: { $meta: "textScore" } } );
How do I force mongo to return documents beginning with "Car" (case insensitive) before returning any other docs also containing "Car" somewhere in the itemName
string?
So I want to retrieve the docs in the following order:
[
{..., itemName: 'Carrot Juice', ...},
{..., itemName: 'Mashed carrots with big carrot pieces', ...}
]
Of course this is meant to be used in a search functionality, so it makes total sense to show the user the items starting with his search string before showing any other items after that.
Until now I was using standard regex, but the performance here is of course much worse! + since I have to search case insensitive, according to the docs, normal regex is not using any indices at all?!
EDIT:
Also, sometimes the behavior of $text
is very weird.
For example I have about 10-15 items where itemName
begins with the word "Zwiebel".
This query
db.items.find({ $text: { $search: "Zwiebel" }, supplier_id: 'iNTJHEf5YgBPicTrJ' }, { score: { $meta: "textScore" } }).sort( { score: { $meta: "textScore" } } );
works like a charm and returns all those documents, while this query
db.items.find({ $text: { $search: "Zwie" }, supplier_id: 'iNTJHEf5YgBPicTrJ' }, { score: { $meta: "textScore" } }).sort( { score: { $meta: "textScore" } } );
does not return anything! Only by changing "Zwiebel" to "Zwie" in the $search
.
I really do not understand how this is possible?!
best, P
A solution is to use the $indexOfCP
operator introcuced in MongoDB 3.4
This operator return the index of the occurence of a String in an other String, and -1 if there is no occurence
how it works:
/car/gi
(case inensitive)index
which stores the index of 'car' in itemName
index
fieldthe query would looks like this:
db.items.aggregate([
{
$match:{
itemName:/car/gi
}
},
{
$project:{
index:{
$indexOfCP:[
{
$toLower:"$itemName"
},
"car"
]
},
price:1,
itemName:1
}
},
{
$sort:{
index:1
}
}
])
and this returns:
{ "_id" : 2, "itemName" : "Carrot juice", "price" : 0.79, "index" : 0 }
{ "_id" : 1, "itemName" : "Mashed carrots with big carrot pieces", "price" : 1.29, "index" : 7 }
try it online: mongoplayground.net/p/FqqCUQI3D-E
Edit:
For the behavior of the $text
index, this is completely normal
A text index tokenize text using delimiters (default delimiters are white space and punctuation). It can only be used to search for whole worlds, and so it won't work for subpart of words
from mongodb text index documentation
$text will tokenize the search string using whitespace and most punctuation as delimiters, and perform a logical OR of all such tokens in the search string.