I have used the new aggregation functionality of Hibernate Search 6 to develop a classic "faceted search" interface, in which the various search fields in the UI are accompanied by the most popular choices taken from the aggregation data of the SearchResult. This works beautifully.
However, I would like users to be able to search these fields case-insensitively, so that they are not limited to choosing from the aggregation results and are not penalised for typing in the wrong case.
I have applied a lowercase normalizer to the aggregable field definition, which allows me to search case-insensitively, but if I do this all of the aggregation data retrieved from the SearchResult, and presented to the user, is also in lowercase.
Is there a way to allow case-insensitive searches while retaining the original case in the aggregation results?
I have attempted to use projectable( Projectable.YES ) in my field definition, in the hope that this would return the original case, but it had no effect.
My current field template definition is:
indexSchemaObjectField.fieldTemplate( "template", f -> f.asString()
                .aggregable( Aggregable.YES )
                .projectable( Projectable.YES )
                .normalizer( "lowercase" )
        )
        .multiValued();
and my lowercase normalizer is defined as:
luceneAnalysisConfigurationContext.normalizer( "lowercase" )
        .custom()
        .tokenFilter( "lowercase" );
I'm using the Lucene backend.
Ideally you'd use multi-fields, but that's not available at the moment (https://hibernate.atlassian.net/browse/HSEARCH-3465).
In the meantime, I would rely on two separate fields:
// Declare one field for aggregations
// (do this first, so that the glob is matched first)
indexSchemaObjectField.fieldTemplate( "template_agg", f -> f.asString()
                .aggregable( Aggregable.YES )
        )
        .matchingPathGlob( "*_agg" )
        .multiValued();
// Declare one field for search
// (the two templates get distinct names, on the assumption that template names must be unique)
indexSchemaObjectField.fieldTemplate( "template", f -> f.asString()
                .normalizer( "lowercase" )
        )
        .multiValued();
Then in your bridge, you would duplicate the value: first populate the field "<fieldname>", then the field "<fieldname>_agg" with the same value.
Finally, when searching, you would use "<fieldname>" for predicates, but "<fieldname>_agg" for aggregations.
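To make the idea concrete, here is a toy, plain-Java model of the two-field approach. It deliberately does not use the Hibernate Search API: TwoFieldSketch, writeValue and searchAndAggregate are hypothetical names invented for this sketch, normalize merely stands in for the "lowercase" normalizer, and the sample field name and values are made up. It only illustrates why matching on the normalized field while counting terms from the untouched "_agg" field gives case-insensitive search with original-case aggregation keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of the two-field approach; NOT the Hibernate Search API. */
final class TwoFieldSketch {

    /** Each "document" maps a field name to its (multi-valued) content. */
    private final List<Map<String, List<String>>> documents = new ArrayList<>();

    /** Stand-in for the "lowercase" normalizer (assumption: plain lowercasing). */
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    /** Stand-in for the bridge: writes the same value to both fields. */
    static void writeValue(Map<String, List<String>> doc, String fieldName, String value) {
        // Search field: stored normalized, so matching ignores case.
        doc.computeIfAbsent(fieldName, k -> new ArrayList<>()).add(normalize(value));
        // Aggregation field: stored untouched, so the original case survives.
        doc.computeIfAbsent(fieldName + "_agg", k -> new ArrayList<>()).add(value);
    }

    /** Indexes one document holding the given values for a single field. */
    void index(String fieldName, String... values) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        for (String value : values) {
            writeValue(doc, fieldName, value);
        }
        documents.add(doc);
    }

    /** Matches on "<fieldName>", then counts terms of "<fieldName>_agg" over the hits. */
    Map<String, Long> searchAndAggregate(String fieldName, String userInput) {
        String needle = normalize(userInput);
        Map<String, Long> counts = new TreeMap<>();
        for (Map<String, List<String>> doc : documents) {
            if (!doc.getOrDefault(fieldName, List.of()).contains(needle)) {
                continue; // the predicate did not match this document
            }
            // Count each distinct original-case value once per matching document.
            for (String original : new LinkedHashSet<>(doc.getOrDefault(fieldName + "_agg", List.of()))) {
                counts.merge(original, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        TwoFieldSketch index = new TwoFieldSketch();
        index.index("author", "Jane Austen", "Leo Tolstoy");
        index.index("author", "Jane Austen");
        index.index("author", "Mark Twain");
        // The input is in the "wrong" case, yet the keys keep their original case.
        System.out.println(index.searchAndAggregate("author", "jane AUSTEN"));
    }
}
```

In a real application the writeValue step would live in your bridge and the search step in your query, with the predicate targeting "<fieldname>" and the aggregation targeting "<fieldname>_agg".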