Below is one of the default field types from the managed-schema.xml
file. People generally seem to reference classes such as solr.LowerCaseFilterFactory,
but this field type has, for example, a filter named lowercase
with no class attribute. Is this configuration actually in effect, or is it just a template?
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer name="standard"/>
<filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter name="lowercase"/>
<filter name="englishPossessive"/>
<filter protected="protwords.txt" name="keywordMarker"/>
<filter name="porterStem"/>
</analyzer>
<analyzer type="query">
<tokenizer name="standard"/>
<filter name="synonymGraph" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
<filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter name="lowercase"/>
<filter name="englishPossessive"/>
<filter protected="protwords.txt" name="keywordMarker"/>
<filter name="porterStem"/>
</analyzer>
</fieldType>
It depends on which version of Solr you're using; later versions can look up the class from its short name (i.e. without the FilterFactory
suffix). See the example in the current reference guide:
<fieldType name="text" class="solr.TextField">
<analyzer>
<tokenizer name="standard"/>
<filter name="lowercase"/>
<filter name="englishPorter"/>
</analyzer>
</fieldType>
Compare that to the legacy format shown in the same guide:
<fieldType name="text" class="solr.TextField">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory"/>
</analyzer>
</fieldType>
As you can see, the full class names are highly repetitive, so instead of requiring the complete class name, Solr resolves the factory from the short name and the element type (tokenizer or filter).
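To relate this back to the text_en field type in the question, here is a rough sketch of how its index analyzer would read in the legacy class-based form. The class names are the standard Solr factories that I'd expect each short name to map to; treat the mapping as an assumption and check it against the reference guide for your version.

```xml
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- name="standard" -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- name="stop" -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <!-- name="lowercase" -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- name="englishPossessive" -->
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <!-- name="keywordMarker" -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- name="porterStem" -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Both forms should describe the same analysis chain, so the name-based definition in your schema is a real, working configuration rather than a template.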