I have integrated Solr with my eCommerce web application. I am indexing the product title and many other product fields into Solr. I have indexed BLÅBÆRSOMMEREN as a product title/name, and I have also added EdgeNGram to the title field. Because of EdgeNGram, if I search for any of the tokens, I get the result. And because of spell check, if I search with a wrong spelling such as BLÅBÆRISOMMEREN, I still get the result. But if I search for BLÅBÆRI, I do not get any result, as there is no token for it.
I want the results to include products that have BLÅBÆR, because that token exists. The same goes for any other misspelled search.
How can I achieve this? Any help will be appreciated!
Thanks.
It sounds like you may have Solr's tokenization configured differently for indexing and querying.
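So, in your example, terms such as the following may appear in the index (assuming a minimum gram size of 2):
bl, blå, blåb, blåbæ, blåbær, blåbærs, ..., blåbærsommeren
However, as your query terms are not being processed into n-grams, you are only searching for
blåbæri
which does not appear among your indexed terms.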
Applying n-grams only at index time is common practice; in your use case, however, it sounds like you want partial matches returned in your results as well.
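To see why the bare query term misses while its n-grams hit, here is a minimal Python sketch of front edge n-grams. The `edge_ngrams` helper is a hypothetical stand-in for the analyzer, not Solr's implementation, and the gram sizes of 2 and 15 are assumed from the example configuration rather than taken from your actual schema.

```python
import unicodedata


def normalize(term):
    """Lowercase and NFC-normalize so 'Å' and 'å' compare consistently."""
    return unicodedata.normalize("NFC", term).lower()


def edge_ngrams(term, min_gram=2, max_gram=15):
    """Return the front edge n-grams of a term (toy model of an edge n-gram filter)."""
    term = normalize(term)
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


# Terms stored at index time for the product title.
indexed = set(edge_ngrams("BLÅBÆRSOMMEREN"))

# Query analyzed WITHOUT n-grams: the single term has to match exactly.
plain_query = {normalize("BLÅBÆRI")}
print(plain_query & indexed)  # set()

# Query analyzed WITH n-grams: the shared prefixes match.
gram_query = set(edge_ngrams("BLÅBÆRI"))
shared = sorted(gram_query & indexed, key=len)
print(shared)  # ['bl', 'blå', 'blåb', 'blåbæ', 'blåbær']
```

Note that even the two-character gram `bl` counts as a match here, which is why very short grams tend to pull in unrelated products.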
Check your Solr schema to make sure that you have a matching EdgeNGram filter configured for query-time as you do for index-time, e.g.
<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
</fieldType>
Make sure you're sorting by score, though, as this strategy will most likely give you many false positives!
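As a rough illustration of sorting by score, the sketch below builds a select URL that requests results ordered by relevance and returns the score alongside each title. The host, the `products` core name, the `title` field, and the `build_query_url` helper are all assumptions for illustration; `q`, `sort`, and `fl` are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port, and core name to your setup.
SOLR_SELECT = "http://localhost:8983/solr/products/select"


def build_query_url(user_input, field="title"):
    """Build a select URL sorted by relevance score, highest first.

    The input is not escaped for Solr query syntax here; a real
    application should escape special characters first.
    """
    params = {
        "q": f"{field}:{user_input}",
        "sort": "score desc",      # best matches first
        "fl": f"{field},score",    # return the score for inspection
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


url = build_query_url("BLÅBÆRI")
print(url)
```

Returning `score` in the field list makes it easier to see where the weak, short-gram matches start and to decide whether a cutoff is worth adding.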