I've got an item that says "4k display" and when I search for "4k display" that item does not seem to be prioritized and other items with "display" (without 4k) come up.
If I search for "4k" nothing shows up.
What in the config should I change to remedy this?
Update: This is what the text field type looks like, likely set up by the sunspot gem.
<fieldType name="text" class="solr.TextField" omitNorms="false">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<!--<filter class="solr.StandardFilterFactory"/>-->
<filter class="solr.LowerCaseFilterFactory"/>
<!--<filter class="solr.KStemFilterFactory"/>-->
<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="7"/>
</analyzer>
</fieldType>
Is minGramSize the culprit?
So let's walk through your analysis chain. First comes the standard tokenizer (StandardTokenizerFactory). It splits text on word boundaries (roughly, whitespace and punctuation), so "4k display" is split into two tokens:
4k,display
Next is the LowerCaseFilterFactory, which lowercases the tokens. In this case nothing changes, as they are already lowercase. So at the end of this step you still have the same two tokens:
4k,display
Now comes the NGramFilterFactory, which breaks each token into substrings (n-grams). For example, if you have a token "abcd",
its n-grams of every possible length are:
a,ab,abc,abcd,b, bc,bcd,c,cd,d
But there are two options set on the n-gram filter in your field type:
minGramSize="3" maxGramSize="7"
which mean that only grams with a minimum length of 3 and a maximum length of 7 are kept. So in the above example you will only see
abc,abcd,bcd
With me so far?
Now let's apply it to your case. After the lowercase filter we had two tokens:
4k,display
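Applying the n-gram filter to both would, before the size limits, give the following:
4,4k,k,d,di,dis,disp,displ,displa,display,i,is,isp and so on. You get the idea.
But since minGramSize is 3, none of the grams of "4k" (4, k, 4k) is long enough, so nothing from that token reaches your index. That is why you cannot search for 4k: it was never in the index. Because the field type declares a single analyzer, the same chain also runs at query time, so in a search for "4k display" the 4k part contributes no terms and only the grams of display do the matching. That is why every item containing "display" comes up and the "4k display" item is not prioritized.
Your index only has the 3 to 7 character substrings of display,
like
dis,disp,displ,displa,display,isp,ispl,spl,pla,lay and so on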
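For comparison, here is a sketch of the same filter line with a lower minimum. The value 2 is my own assumption for illustration, not something taken from your config. With it, 4k would itself be emitted as a gram and so become searchable, at the cost of a larger index and looser matches, and you would have to reindex for it to take effect:

```xml
<!-- Hypothetical variant: minGramSize lowered from 3 to 2 so that the
     two-character token "4k" is kept as a gram -->
<filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="7"/>
```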
To fix this, you first need to decide how you want to search your data.
Do you really need the NGramFilterFactory?
For example, if you just want whole-word matches, i.e. when you query "4k display" you want only results that contain "4k", "display", or "4k display", then you need to change your analysis chain.
In that case, comment out the NGram filter in your analysis chain, reindex, and try querying again.
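As a minimal sketch, assuming you keep everything else in your field type unchanged, the definition with the NGram filter commented out would look roughly like this:

```xml
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!--<filter class="solr.StandardFilterFactory"/>-->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!--<filter class="solr.KStemFilterFactory"/>-->
    <!-- NGram filter disabled: whole tokens such as "4k" and "display"
         are now indexed as they are -->
    <!--<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="7"/>-->
  </analyzer>
</fieldType>
```

With this chain the index holds the tokens 4k and display unchanged, so partial-word queries such as "disp" will no longer match. Documents indexed before the change keep their old terms until they are reindexed.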