spring-boot elasticsearch lucene hibernate-search hibernate-5.x

How to match numeric and boolean values in a Lucene query


I am using Hibernate Search to construct a Lucene query that returns string values containing (part of) the search string. In addition, the query must only return a value if the language id matches and the deleted flag isn't set to true. I wrote the code below for this, but the problem is that it doesn't return anything.

private Query getQueryWithBooleanClauses(Class entityClass, String searchString, Long preferredLanguageId, FullTextEntityManager fullTextEntityManager, String firstField, String... additionalFields) {
    QueryBuilder queryBuilder = getQueryBuilder(entityClass, fullTextEntityManager);
    Query containsSearchString = getMatchingStringCondition(searchString, queryBuilder, firstField, additionalFields);
    BooleanQuery isPreferredOrDefaultLanguageTranslation = getLanguageCondition(preferredLanguageId);
    BooleanQuery finalQuery = new BooleanQuery.Builder()
            .add(new TermQuery(new Term("parentDeleted", "false")), BooleanClause.Occur.MUST)
            .add(new TermQuery(new Term("parentApproved", "true")), BooleanClause.Occur.MUST)
            .add(new TermQuery(new Term("childDeleted", "false")), BooleanClause.Occur.MUST)
            .add(isPreferredOrDefaultLanguageTranslation, BooleanClause.Occur.MUST)
            .add(containsSearchString, BooleanClause.Occur.MUST)
            .build();
    return finalQuery;
}

getMatchingStringCondition

private Query getMatchingStringCondition(String searchString, QueryBuilder queryBuilder, String firstField, String... additionalFields) {
    log.info(MessageFormat.format("{0}*", searchString));
    return queryBuilder.simpleQueryString()
            .onFields(firstField, additionalFields)
            .withAndAsDefaultOperator()
            .matching(MessageFormat.format("{0}*", searchString))
            .createQuery();
}

getLanguageCondition

private BooleanQuery getLanguageCondition(Long preferredLanguageId) {
    return new BooleanQuery.Builder()
            .add(createLanguagePredicate(preferredLanguageId), BooleanClause.Occur.SHOULD)
            .add(createLanguagePredicate(languageService.getDefaultLanguage().getId()), BooleanClause.Occur.SHOULD)
            .build();
}

createLanguagePredicate

private Query createLanguagePredicate(Long languageId){
    return new TermQuery(new Term("language.languageId", languageId.toString()));
}

Query executing method

public List<AutoCompleteSuggestion> findAllBySearchStringAndDeletedIsFalse(Class entityClass, String searchString, Long preferredLanguageId){
    FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
    Query finalQuery = getQueryWithBooleanClauses(entityClass, searchString, preferredLanguageId, fullTextEntityManager, "parent.latinName", "translatedName");
    FullTextQuery fullTextQuery = fullTextEntityManager.createFullTextQuery(finalQuery, entityClass);
    fullTextQuery.setProjection("parentId", "autoCompleteSuggestion", "childApproved"); // order must match the argument order of the AutoCompleteSuggestion constructor, see convertToAutoCompleteSuggestionList
    fullTextQuery.setMaxResults(maxResults);
    return convertToAutoCompleteSuggestionList(fullTextQuery.getResultList());
}

This code doesn't throw an error but never returns anything either. Only when I remove all the boolean conditions for the boolean and numerical fields, leaving only the containsSearchString condition, does the query return anything.

According to the post "Hibernate Search 5.0 Numeric Lucene Query HSEARCH000233 issue", this happens because as of Hibernate Search 5 numerical fields are no longer treated as text fields, so you can't perform matching queries on numerical fields.

You can force the fields to be treated as text fields by annotating them with @FieldBridge, but I'd rather not do that. So my question is: how do I perform match queries on non-text fields like booleans, dates, and numbers?

EDIT: It works if I annotate all the fields required for filtering with @FieldBridge(impl = implementation.class); the index parameter must also always be set to YES.

But now all these fields will be stored as strings, which is undesirable. So I'd still like to know if there is another, more elegant way to apply filters.

EDIT 2:

@yrodiere, when I removed @FieldBridge(impl = LongBridge.class) from languageId and replaced the line .add(isPreferredOrDefaultLanguageTranslation, BooleanClause.Occur.MUST) with:

.add(queryBuilder.bool().must(queryBuilder.keyword().onField("language.languageId").matching(languageService.getDefaultLanguage().getId().toString()).createQuery()).createQuery(), BooleanClause.Occur.MUST)

I get the error:

org.hibernate.search.exception.SearchException: HSEARCH000238: Cannot create numeric range query for field 'language.languageId', since values are not numeric (Date, int, long, short or double)

However, I just discovered that matching() also accepts a Long, so I don't have to call toString() on it. When matching() is given the Long value I don't get an error, but nothing is returned either.

Only when I use new TermQuery(new Term("language.languageId", languageId.toString())) instead of matching(), while also using a LongBridge for languageId, does anything get returned. Am I defining the matching() query incorrectly?

I also have a different question that I wanted to start a new SO question for, but maybe you can answer it in this thread as well :). It is about the includeEmbeddedObjectId parameter of @IndexedEmbedded. I think I know what this does, but I would like some confirmation from you.

I assume that when I set this to true, the id of the parent entity will be included in the Lucene document of the child entity, correct? Let's say this parent entity is used in a matching() query that's used as a true/false condition. Is it then correct to assume that the search will be faster because the id can now also be found in the Lucene document of the child entity?

Thanks


Solution

  • Booleans are still indexed as strings in Hibernate Search 5. See org.hibernate.search.bridge.builtin.BooleanBridge. So boolean fields are not part of the problem here.

    If you really want to create numeric queries yourself, in Hibernate Search 5 you will have to use numeric range queries, e.g.:

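    private Query createLanguagePredicate(Long languageId) {
        // Exact match on a numeric field: an inclusive range whose lower and upper bounds are both languageId
        return org.apache.lucene.search.NumericRangeQuery.newLongRange(
                "language.languageId", languageId, languageId, true, true);
    }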
    

    That being said, to avoid this kind of problem, you should use the Hibernate Search DSL. Then you'll pass values of the type you use in your model (here, a Long), and Hibernate Search will create the right query automatically.

    Or even better, upgrade to Hibernate Search 6, which exposes a different API that is less verbose and has fewer quirks. See for yourself in the documentation of the Search DSL in Hibernate Search 6, in particular the predicate DSL.
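
To illustrate why the string-based TermQuery on languageId finds nothing while the boolean terms and the exact-value numeric range do match, here is a toy in-memory model. It is only a sketch and does not use Lucene or Hibernate Search at all: the class NumericMatchSketch and its methods add, countByTerm and countByLongRange are hypothetical names invented for this illustration. It assumes, as described above, that booleans are indexed as the strings "true"/"false" and that Long values are indexed as numbers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy in-memory "index" (NOT Lucene, hypothetical names throughout).
 * Each document maps a field name to the value as it was indexed:
 * booleans become the strings "true"/"false", Long values stay numeric.
 */
public class NumericMatchSketch {

    private final List<Map<String, Object>> documents = new ArrayList<>();

    /** Adds one document; boolean values are converted to strings, numbers are kept as-is. */
    public void add(Map<String, Object> fields) {
        Map<String, Object> indexed = new HashMap<>();
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            Object v = e.getValue();
            indexed.put(e.getKey(), v instanceof Boolean ? v.toString() : v);
        }
        documents.add(indexed);
    }

    /** Term-style lookup: matches only values that were indexed as exactly this string. */
    public int countByTerm(String field, String text) {
        int hits = 0;
        for (Map<String, Object> doc : documents) {
            if (text.equals(doc.get(field))) {
                hits++;
            }
        }
        return hits;
    }

    /** Range-style lookup: matches numerically indexed values in [min, max], bounds inclusive. */
    public int countByLongRange(String field, long min, long max) {
        int hits = 0;
        for (Map<String, Object> doc : documents) {
            Object v = doc.get(field);
            if (v instanceof Long && (Long) v >= min && (Long) v <= max) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NumericMatchSketch index = new NumericMatchSketch();
        Map<String, Object> doc = new HashMap<>();
        doc.put("childDeleted", false);
        doc.put("language.languageId", 5L);
        index.add(doc);

        System.out.println(index.countByTerm("childDeleted", "false"));            // prints 1
        System.out.println(index.countByTerm("language.languageId", "5"));         // prints 0
        System.out.println(index.countByLongRange("language.languageId", 5L, 5L)); // prints 1
    }
}
```

The string term "5" never equals the numerically indexed value 5, which mirrors the empty result in the question, whereas a range with identical inclusive bounds behaves like an exact match.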