
Solr search with misspelled terms


I have integrated Solr with my eCommerce web application. I am indexing the product title and many other product fields in Solr. I have indexed BLÅBÆRSOMMEREN as a product title/name, and I have added EdgeNGram to the title field as well. Because of EdgeNGram, if I search for any of the tokens, I get the result. And because of spell check, if I search with a misspelling such as BLÅBÆRISOMMEREN, I still get the result. But if I search for BLÅBÆRI, I do not get any result, as there is no token for it.

I want the results to include the products that have BLÅBÆR, because that token exists. The same should apply to any other misspelled search.

How can I achieve this? Any help will be appreciated!

Thanks.


Solution

  • It sounds like you may have Solr's tokenization configured differently for indexing and querying.

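    So, in your example, terms such as bl, blå, blåb, blåbæ, blåbær and so on, up to blåbærsommeren, may appear in the index (assuming a minimum gram size of 2). However, as your query terms are not being processed into ngrams, you are only searching for the whole term blåbæri, which does not appear within your indexed terms.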

    This is common practice when using ngrams; however, it sounds like in your use case you want partial matches to be returned in your results.

    Check your Solr schema to make sure that you have a matching EdgeNGram filter configured for query-time as you do for index-time, e.g.

    <fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
       <!-- Index time: store every prefix (2 to 15 characters) of each token -->
       <analyzer type="index">
          <tokenizer class="solr.LowerCaseTokenizerFactory"/>
          <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
       </analyzer>
       <!-- Query time: apply the same filter so that prefixes of the query term can match -->
       <analyzer type="query">
          <tokenizer class="solr.LowerCaseTokenizerFactory"/>
          <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
       </analyzer>
    </fieldType>
    

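    Make sure you're sorting by score though, as this strategy will most likely give you many false positives! For example, with minGramSize="2" as above, a title that shares only its first two letters with the query can still match.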
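
To see why the query-time filter matters, here is a rough Python sketch that imitates the analysis chain outside of Solr. It is only an illustration under some assumptions: the helper names (tokenize, edge_ngrams, analyze, search), the second product title and the overlap count used for ordering are all made up for this example, and the overlap count is a crude stand-in for relevance, not how Solr actually scores documents.

```python
import re

def tokenize(text):
    """Rough stand-in for a lowercasing letter tokenizer: lowercase, keep runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())

def edge_ngrams(token, min_gram=2, max_gram=15):
    """Front edge n-grams of a token, e.g. 'blå' -> ['bl', 'blå']."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]

def analyze(text, ngram):
    """Return the set of terms produced for a text, with or without edge n-grams."""
    terms = set()
    for token in tokenize(text):
        if ngram:
            terms.update(edge_ngrams(token))
        else:
            terms.add(token)
    return terms

# Hypothetical product titles; the second one only exists to show a false positive.
titles = ["BLÅBÆRSOMMEREN", "Blomsterbok"]

# Index-time analysis always applies the edge n-gram step.
index = {title: analyze(title, ngram=True) for title in titles}

def search(query, ngram_query):
    """Match query terms against indexed terms, ordered by number of shared terms."""
    query_terms = analyze(query, ngram=ngram_query)
    hits = []
    for title, terms in index.items():
        overlap = len(query_terms & terms)
        if overlap:
            hits.append((overlap, title))
    # Highest overlap first (a crude substitute for sorting by score).
    return [title for overlap, title in sorted(hits, key=lambda hit: -hit[0])]

# Without query-time n-grams the only query term is 'blåbæri', which is not indexed.
print(search("BLÅBÆRI", ngram_query=False))  # []

# With query-time n-grams, 'blåbær' and shorter prefixes match the real product,
# but 'bl' alone also matches the unrelated title.
print(search("BLÅBÆRI", ngram_query=True))   # ['BLÅBÆRSOMMEREN', 'Blomsterbok']
```

In this sketch the unrelated title still comes back, only lower down, which is the false-positive effect mentioned above and the reason ordering by relevance matters.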