We have a large synonym list. I use a custom analyzer to index the search field, and the synonym list is applied with the SynonymGraphFilterFactory filter. So far everything works: when I search on the field, I get the matching results. The synonym list looks like this: car, vehicle
If I enter "car" in my search, the correct results are displayed and the word "car" is highlighted.
When I enter the word "vehicle" I get correct results but nothing is highlighted.
I would like both words, "car" and "vehicle", to be highlighted in the search results. Is that even possible?
So far I haven't found a suitable solution. Maybe someone can help me here.
Configuration: Hibernate Search 6, Lucene Highlighter 8.7
Code:
To index the search field, my analyzer looks like this:
context.analyzer("myCustomAnalyzer").custom()
    .tokenizer(StandardTokenizerFactory.class)
    .tokenFilter(LowerCaseFilterFactory.class)
    .tokenFilter(KeywordRepeatFilterFactory.class)
    .tokenFilter(PorterStemFilterFactory.class)
    .tokenFilter(TrimFilterFactory.class)
    .tokenFilter(SnowballPorterFilterFactory.class).param("language", "German")
    .tokenFilter(RemoveDuplicatesTokenFilterFactory.class)
    .tokenFilter(SynonymGraphFilterFactory.class).param("synonyms", "synonyms/synonyms.properties")
        .param("ignoreCase", "true").param("expand", "true");
Highlighter method looks like this:
private Results highlighting(final Results results, final String mySearchString) {
    final SimpleHTMLFormatter simpleHTMLFormatter = new SimpleHTMLFormatter("start", "end");
    final TermQuery query = new TermQuery(
            new Term("indexFieldName", mySearchString));
    final QueryScorer queryScorer = new QueryScorer(query, "indexFieldName");
    final Fragmenter fragmenter = new SimpleSpanFragmenter(queryScorer);
    queryScorer.setExpandMultiTermQuery(true);
    final Highlighter highlighter = new Highlighter(simpleHTMLFormatter, queryScorer);
    highlighter.setTextFragmenter(fragmenter);
    try (Analyzer analyzer = new StandardAnalyzer()) {
        for (final MyEntity my : results.getMyResults()) {
            for (final MySecondEntity sec : my.getMyDescriptions()) {
                final String text = sec.getMyName();
                try {
                    final TokenStream tokenStream = analyzer.tokenStream(
                            "indexFieldName", new StringReader(text));
                    final String result = highlighter.getBestFragments(
                            tokenStream, text,
                            sec.getMyName().length(), " ...");
                    if (!StringUtils.isBlank(result)) {
                        sec.setMyName(result);
                    }
                } catch (final Exception e) {
                    LOG.warn(String.format(
                            "Failure during highlighting process for ..."...
                }
            }
        }
    }
    return results;
}
Thank you for your answers
I'm not overly familiar with highlighters, but one thing that seems suspicious in your code is that you're using a StandardAnalyzer to highlight. If you want synonyms to be highlighted, I believe you need to use an analyzer that handles synonyms.
Try using the same analyzer for indexing and highlighting.
You can retrieve the analyzer instance from Hibernate Search. See this section of the documentation, or this example:
LuceneBackend luceneBackend =
        Search.mapping( entityManager.getEntityManagerFactory() )
                .backend().unwrap( LuceneBackend.class );
Analyzer analyzer = luceneBackend.analyzer( "myCustomAnalyzer" ).get();
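Then use it instead of new StandardAnalyzer() in your highlighting code; just make sure you don't close this analyzer, since the instance is managed by Hibernate Search. In particular, don't declare it in a try-with-resources block the way the StandardAnalyzer is declared in your current code.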
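To illustrate why the choice of analyzer matters at highlighting time, here is a toy model in plain Java. It is not Lucene or Hibernate Search code: the class SynonymHighlightSketch and its analyze and highlight methods are made up for this illustration, and the real Highlighter is considerably more involved. The idea it sketches is that a highlighter can only mark a word if the token stream produced from the text contains the query term, so an analyzer without the synonym filter never emits "vehicle" for a text that says "car".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only: none of these names are Lucene or Hibernate Search APIs.
class SynonymHighlightSketch {

    // A term plus the character offsets of the original word it was derived from.
    static final class Token {
        final String term;
        final int start;
        final int end;

        Token(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    // Stand-in for an analyzer: lowercases each word and emits every configured
    // synonym with the same offsets as the word it expands.
    static List<Token> analyze(String text, Map<String, List<String>> synonyms) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("\\p{L}+").matcher(text);
        while (m.find()) {
            String term = m.group().toLowerCase(Locale.ROOT);
            tokens.add(new Token(term, m.start(), m.end()));
            for (String synonym : synonyms.getOrDefault(term, List.of())) {
                tokens.add(new Token(synonym, m.start(), m.end()));
            }
        }
        return tokens;
    }

    // Stand-in for a highlighter: wraps the original word in <b>...</b> whenever
    // one of the tokens at that position equals the query term.
    static String highlight(String text, String queryTerm, Map<String, List<String>> synonyms) {
        StringBuilder out = new StringBuilder();
        int last = 0;
        for (Token t : analyze(text, synonyms)) {
            if (t.term.equals(queryTerm) && t.start >= last) {
                out.append(text, last, t.start)
                   .append("<b>").append(text, t.start, t.end).append("</b>");
                last = t.end;
            }
        }
        return out.append(text.substring(last)).toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> noSynonyms = Map.of();
        Map<String, List<String>> withSynonyms =
                Map.of("car", List.of("vehicle"), "vehicle", List.of("car"));
        // Analysis without synonyms, roughly the StandardAnalyzer situation.
        System.out.println(highlight("A red car", "vehicle", noSynonyms));
        // Analysis with synonyms, roughly the custom analyzer situation.
        System.out.println(highlight("A red car", "vehicle", withSynonyms));
    }
}
```

In this model, a query for "vehicle" only marks the word "car" when the synonym map is passed to the analysis step, which mirrors the behaviour described in the question. The same reasoning suggests the query term should match what the analyzer produces in other respects too (lowercasing, stemming), although that part is not modelled here.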