typesense

Why is Typesense returning only a few of the matches in the index?


I have a Typesense server running and working. However, I have found some strange behaviour. The index contains documents with a string field called "product_number", holding product numbers in the format "BKP001", "BKP002", ... "BKP999". There are 60 such documents.

However, when I query Typesense and search for "BKP", it finds only 4 seemingly random matching documents. Strangely, if I do a more specific search for "BKP01", it returns 4 documents: "BKP010", "BKP011", "BKP012" and "BKP013".

And when I search for "BKP03", it returns 4 documents: "BKP031", "BKP032", "BKP033" and "BKP034". So it is clear that all the documents are correctly in the index.

What could be the reason why it doesn't find all the documents?


Solution

  • When there are several possible prefix matches for a particular keyword, Typesense limits the number of prefixes picked for performance reasons, until more of the keyword is typed.

    So, for example, if you type "a", Typesense will pick the top 10 prefixes that start with "a" and return all results for these top 10 prefixes. As users type in more letters, e.g. "acd", it will then pick the top 10 prefixes that start with "acd" and return all results for those, and so on. The more characters are typed in, the more refined the prefix search gets.

    If a collection has fewer than 500K documents, Typesense defaults to looking at the top 10 prefixes. For a collection larger than that, Typesense defaults to looking at the top 4 prefixes.

    If you want all prefixes to be considered for the search, add max_candidates=10000 as a search parameter.

    Also make sure that you're using the latest version of Typesense, as relevance improvements are released regularly with each version.
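
As a rough illustration of the max_candidates suggestion above, here is a minimal Python sketch that builds a search request with that parameter set. The host (http://localhost:8108), the API key ("xyz"), the collection name ("products") and the helper name build_search_request are assumptions made up for this example; only the "product_number" field, the "BKP" query and max_candidates=10000 come from the question and answer. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url, api_key, collection, query, query_by,
                         max_candidates=10000):
    """Build (but do not send) a Typesense search request.

    base_url, api_key and collection are placeholders for this sketch;
    replace them with the values of your own deployment.
    """
    params = {
        "q": query,                        # the text typed by the user
        "query_by": query_by,              # field(s) to search in
        "max_candidates": max_candidates,  # consider more prefix candidates
    }
    url = f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"
    # Typesense reads the API key from the X-TYPESENSE-API-KEY header.
    return Request(url, headers={"X-TYPESENSE-API-KEY": api_key})


# Hypothetical values: a local server and a collection called "products".
req = build_search_request(
    "http://localhost:8108", "xyz", "products", "BKP", "product_number"
)
print(req.full_url)
```

To actually run the search, the request could be passed to urllib.request.urlopen(req) against a running server. The same max_candidates key should also work in the search parameters passed to an official client library.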