
How can I get the highlights of my result set in Hibernate Search 6?


I am using the Hibernate Search 6 Lucene backend in my Java application.

There are various search operations I am performing including a fuzzy search.

I get search results without any issues.

Now I want to show why each result in my result list was picked.

Let's say the search keyword is "test", and the fuzzy search is performed in the fields "name", "description", "Id" etc. And I get 10 results in a List. Now I want to highlight the values in the fields of each result which caused that result to be a matching result.

E.g., consider the following to be one of the items in the search result List object (written in JSON format for clarity):

 {  
    name:"ABC some test name",
    description: "this is a test element",
    id: "abc123"
}

As the result suggests, it was picked as a search result because the keyword "test" appears in both the "name" and "description" fields. I want to highlight those specific fields in the frontend when I show the search results.

Currently, I am retrieving search results through a Java REST API for my Angular frontend. How can I get those specific fields and their values using Hibernate Search 6 in my Java application?

So far I have gone through the Hibernate Search 6 documentation and found nothing (https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#preface). I have also looked at what seemed to be related issues on the web over the past week and found nothing so far. It seems like my requirement is a little specific, and that's why I need your help here.


Solution

  • Update: starting with Hibernate Search 6.2, highlighting is implemented. See https://hibernate.org/search/releases/6.2/#search-highlighting and https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#search-dsl-highlighting. When this answer was first written, highlighting was not yet implemented in Hibernate Search; see HSEARCH-2192.


    Old answer (for versions before 6.2):

    Even without built-in highlighting, you can leverage native Elasticsearch / Lucene APIs.

    With Elasticsearch it's relatively easy: you can use a request transformer to add a highlight element to the HTTP request, then use the jsonHit projection to retrieve the JSON for each hit, which contains a highlight element that includes the highlighted fields and the highlighted fragments.

    With Lucene it would be more complex and you'll have to rely on unsupported features, but that's doable.

    Retrieve the Lucene Query from your Hibernate Search predicate:

    SearchPredicate predicate = ...;
    Query query = LuceneMigrationUtils.toLuceneQuery(predicate);
    

    Then do the highlighting. The question "Hibernate search highlighting not analyzed fields" may help with that, though its code uses an older version of Lucene, so you might have to adapt it:

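    // Needs org.apache.lucene.analysis.Analyzer, org.apache.lucene.search.Query
    // and org.apache.lucene.search.highlight.* (from lucene-highlighter)
    String highlightText(Query query, Analyzer analyzer, String fieldName, String text)
            throws IOException, InvalidTokenOffsetsException {
        QueryScorer queryScorer = new QueryScorer(query);
        SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<span>", "</span>");
        Highlighter highlighter = new Highlighter(formatter, queryScorer);
        // Returns null when none of the query terms are found in the text
        return highlighter.getBestFragment(analyzer, fieldName, text);
    }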
    

    You'll need to add a dependency on org.apache.lucene:lucene-highlighter.

    To retrieve the analyzer, use the Hibernate Search metadata: https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#backend-lucene-access-analyzers

    So, connecting the dots, something like this:

    Analyzer retrieveSearchAnalyzer(SearchScope<?> scope) {
        // Taking a shortcut here to retrieve the index manager,
        // since we already have the scope
        // WARNING: This only works when searching a single index
        return scope.includedTypes().iterator().next().indexManager()
                .unwrap( LuceneIndexManager.class )
                .searchAnalyzer();
    }

    Highlighter createHighlighter(SearchPredicate predicate) {
        // WARNING: this method is not supported and might disappear in future versions of HSearch
        Query query = LuceneMigrationUtils.toLuceneQuery(predicate);
        QueryScorer queryScorer = new QueryScorer(query);
        SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<span>", "</span>");
        return new Highlighter(formatter, queryScorer);
    }

    // getBestFragment throws checked exceptions, which the projection lambda cannot propagate
    String highlight(Highlighter highlighter, Analyzer analyzer, String fieldName, String text) {
        try {
            return highlighter.getBestFragment(analyzer, fieldName, text);
        }
        catch (IOException | InvalidTokenOffsetsException e) {
            throw new IllegalStateException("Could not highlight field " + fieldName, e);
        }
    }

    SearchSession searchSession = Search.session( entityManager );

    SearchScope<Book> scope = searchSession.scope( Book.class );
    SearchPredicate predicate = scope.predicate().match()
                    .fields( "title", "authors.name" )
                    .matching( "refactoring" )
                    .toPredicate();

    Analyzer analyzer = retrieveSearchAnalyzer(scope);
    Highlighter highlighter = createHighlighter(predicate);

    // Using Pair from Apache Commons, but others would work just as well
    List<Pair<Book, String>> hits = searchSession.search( scope )
            .select( f -> f.composite(
                    // Highlighting the title only, but you can do the same for other fields
                    book -> Pair.of( book, highlight(highlighter, analyzer, "title", book.getTitle()) ),
                    f.entity()
            ) )
            .where( predicate )
            // fetchHits returns the List of hits directly; fetch would return a SearchResult
            .fetchHits( 20 );
    

    Not sure this compiles, but that should get you started.
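Since the question asks which fields caused each match, here is a minimal sketch of how the per-field fragments could be gathered for one hit. It relies only on the JDK: the `fragmentFor` function is a stand-in for a call such as `highlighter.getBestFragment(analyzer, fieldName, text)`, which returns null when no query term is found in the text. The class name `FieldHighlights` and the method `collect` are hypothetical names chosen for this illustration, not part of Hibernate Search or Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;

final class FieldHighlights {

    private FieldHighlights() {
    }

    /**
     * Collects one highlighted fragment per field, skipping fields that did not match.
     *
     * @param fieldValues field name -> raw text of that field for one hit
     * @param fragmentFor (fieldName, text) -> highlighted fragment, or null when nothing matched
     *                    (assumption: in the Lucene approach this would wrap getBestFragment)
     */
    static Map<String, String> collect(Map<String, String> fieldValues,
            BiFunction<String, String, String> fragmentFor) {
        Map<String, String> highlights = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fieldValues.entrySet()) {
            String text = entry.getValue();
            if (text == null) {
                continue; // no value, so nothing to highlight for this field
            }
            String fragment = fragmentFor.apply(entry.getKey(), text);
            if (fragment != null) {
                // Only fields that actually matched end up in the result
                highlights.put(entry.getKey(), fragment);
            }
        }
        return highlights;
    }

    public static void main(String[] args) {
        // Demo-only stand-in: exact, case-sensitive match on "test" (not a fuzzy match)
        BiFunction<String, String, String> fakeHighlighter = (field, text) ->
                text.contains("test") ? text.replace("test", "<span>test</span>") : null;

        Map<String, String> hit = new LinkedHashMap<>();
        hit.put("name", "ABC some test name");
        hit.put("description", "this is a test element");
        hit.put("id", "abc123");

        System.out.println(collect(hit, fakeHighlighter));
    }
}
```

With the sample item from the question, the resulting map contains entries for "name" and "description" only, because "id" produces no fragment. A REST endpoint could return this map next to each entity so the Angular frontend knows which fields to mark.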


    Relatedly, but not exactly what you're asking for, there's an explain feature to get a sense of why a given hit has a given score: https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#search-dsl-query-explain