I am trying to implement autocomplete, inspired by the "Search analyzer" section of the Hibernate Search 6.0.0.Beta2 release announcement.
This is the example from that link that I am trying to follow:
@Entity
@Indexed
public class Book {

    @Id
    private Long id;

    @FullTextField(
            name = "title_autocomplete",
            analyzer = "autocomplete",
            searchAnalyzer = "autocomplete_query"
    )
    private String title;

    // ... getters and setters ...
}
To define an analyzer named "autocomplete" and a search analyzer named "autocomplete_query", I followed section 10.6.4, "Custom analyzers and normalizers", defined the following custom Lucene analysis configurer, and created a new persistence.xml.
public class CustomLuceneAnalysisConfigurer implements LuceneAnalysisConfigurer {
    @Override
    public void configure(LuceneAnalysisConfigurationContext context) {
        context.analyzer("autocomplete").custom()
                .tokenizer(StandardTokenizerFactory.class)
                .charFilter(HTMLStripCharFilterFactory.class)
                .tokenFilter(LowerCaseFilterFactory.class)
                        .param("language", "English")
                .tokenFilter(ASCIIFoldingFilterFactory.class)
                .tokenFilter(EdgeNGramFilterFactory.class);

        context.analyzer("autocomplete_query").custom()
                .tokenizer(StandardTokenizerFactory.class)
                .charFilter(HTMLStripCharFilterFactory.class)
                .tokenFilter(LowerCaseFilterFactory.class)
                        .param("language", "English")
                .tokenFilter(ASCIIFoldingFilterFactory.class);
    }
}
<property name="hibernate.search.backend.analysis.configurer"
value="class:net.ad.mc.lucene_search.CustomLuceneAnalysisConfigurer"/>
My question is: is there a way to set minGramSize and maxGramSize using the above method? I've gone through the official documentation but found no information on how to do this.
This can be done similarly to how you already specify the language parameter for the lowercase filter: tokenFilter() returns a DSL step that exposes a param() method through which you can pass any filter-specific parameters:
public class CustomLuceneAnalysisConfigurer implements LuceneAnalysisConfigurer {
    @Override
    public void configure(LuceneAnalysisConfigurationContext context) {
        context.analyzer("autocomplete").custom()
                .tokenizer(StandardTokenizerFactory.class)
                .charFilter(HTMLStripCharFilterFactory.class)
                .tokenFilter(LowerCaseFilterFactory.class)
                        .param("language", "English")
                .tokenFilter(ASCIIFoldingFilterFactory.class)
                .tokenFilter(EdgeNGramFilterFactory.class)
                        .param("minGramSize", "3")
                        .param("maxGramSize", "7");

        context.analyzer("autocomplete_query").custom()
                .tokenizer(StandardTokenizerFactory.class)
                .charFilter(HTMLStripCharFilterFactory.class)
                .tokenFilter(LowerCaseFilterFactory.class)
                        .param("language", "English")
                .tokenFilter(ASCIIFoldingFilterFactory.class);
    }
}
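As a side note, once the analyzers are registered, the autocomplete field can be queried through the SearchSession API roughly like the sketch below. This uses the final 6.0 method names, which may differ slightly in the early betas such as Beta2, and the helper class and input string are just placeholders:

import java.util.List;

import javax.persistence.EntityManager;

import org.hibernate.search.mapper.orm.Search;
import org.hibernate.search.mapper.orm.session.SearchSession;

public class AutocompleteSearch {

    // Returns up to 20 books whose indexed title_autocomplete field matches what the
    // user has typed so far; the "autocomplete_query" search analyzer is applied to
    // the input automatically because it is configured on the field mapping.
    public List<Book> suggest(EntityManager entityManager, String userInput) {
        SearchSession searchSession = Search.session(entityManager);
        return searchSession.search(Book.class)
                .where(f -> f.match()
                        .field("title_autocomplete")
                        .matching(userInput))
                .fetchHits(20);
    }
}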
If you are unsure about the parameter name strings for a token filter, open the filter class implementation and look for a constructor accepting a map; it will contain the parameter names.
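For illustration only, the pattern you will find there looks roughly like this simplified sketch (not the actual Lucene source); the string keys read from the map are exactly the names you pass to param(...):

import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the constructor pattern used by Lucene analysis factories,
// not the real EdgeNGramFilterFactory source.
public class EdgeNGramParamsSketch {

    private final int minGramSize;
    private final int maxGramSize;

    public EdgeNGramParamsSketch(Map<String, String> args) {
        Map<String, String> remaining = new HashMap<>(args);
        // These string keys are the parameter names accepted by .param(...).
        minGramSize = Integer.parseInt(remaining.remove("minGramSize"));
        maxGramSize = Integer.parseInt(remaining.remove("maxGramSize"));
        // Any leftover entries are unknown parameter names and are rejected.
        if (!remaining.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + remaining);
        }
    }
}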