
Hibernate Search: Elasticsearch and Lucene yield different search results


I am trying to implement fairly basic search functionality for my REST backend using Spring Data REST and Hibernate Search. I would like to let users execute arbitrary queries by passing query strings to a search function. To make it easier to run the backend locally, and to avoid having to spin up Elasticsearch to run tests, I would like to be able to work with a local index in those situations.

My problem is that the following code does not yield the same results with a local index as it does with Elasticsearch. I have tried to limit the code below to what I believe is relevant.

The entity:

@Indexed(index = "MyEntity")
@AnalyzerDef(name = "ngram",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class ),
    filters = {
      @TokenFilterDef(factory = StandardFilterFactory.class),
      @TokenFilterDef(factory = LowerCaseFilterFactory.class),
      @TokenFilterDef(factory = StopFilterFactory.class),
      @TokenFilterDef(factory = NGramFilterFactory.class,
        params = {
          @Parameter(name = "minGramSize", value = "2"),
          @Parameter(name = "maxGramSize", value = "3") } )
    }
)
public class MyEntity {

    @NotNull
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "ngram"))
    private String name;

    @Field(analyze = Analyze.YES, store = Store.YES)
    @FieldBridge(impl = StringCollectionFieldBridge.class)
    @ElementCollection(fetch = FetchType.EAGER)
    private Set<String> tags = new HashSet<>();

}

application.yml for local index:

spring: 
  jpa:
    hibernate:
      ddl-auto: update
    show-sql: false

application.yml for Elasticsearch:

spring: 
  jpa:
    hibernate:
      ddl-auto: create-drop
    properties:
      hibernate:
        search:
          default:
            indexmanager: elasticsearch
            elasticsearch:
              host: 127.0.0.1:9200
              required_index_status: yellow

Search endpoint:

private static final String[] FIELDS = { "name", "tags" };

@Override
public List<MyEntity> querySearch(String queryString) throws ParseException {
    QueryParser queryParser = new MultiFieldQueryParser(FIELDS, new SimpleAnalyzer());
    queryParser.setDefaultOperator(QueryParser.AND_OPERATOR);
    org.apache.lucene.search.Query query = queryParser.parse(queryString);

    FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(this.entityManager);

    javax.persistence.Query persistenceQuery = 
            fullTextEntityManager.createFullTextQuery(query, MyEntity.class);

    return persistenceQuery.getResultList();
}

I create an instance of MyEntity with the following values:

$ curl 'localhost:8086/myentities'
{
  "_embedded" : {
    "myentities" : [ {
      "name" : "Test Entity",
      "tags" : [ "bar", "foobar", "foo" ],
      "_links" : {
        ...
      }
    } ]
  },
  "_links" : {
    ...
  }
}

The following queries work (return that entity) using Elasticsearch:

Using a local index, I get the result for "tags:bar", but the queries on the name field return no results. Any ideas why this is the case?


Solution

  • You should make sure that the Elasticsearch mapping is properly created by Hibernate Search. By default, Hibernate Search will only create a mapping if it is missing.

    If you launched your application once, then changed the mapping, and launched the application again, it is possible that the name field does not have the correct mapping in Elasticsearch.

    In development mode, try this:

    spring: 
      jpa:
        hibernate:
          ddl-auto: create-drop
        properties:
          hibernate.search:
              schema_management.strategy: drop-and-create-and-drop
    

    See https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#mapper-orm-schema-management-strategy

    Note that documents being successfully indexed is unfortunately not an indication that your mapping is correct: when you index a field it does not know, Elasticsearch creates that field dynamically and tries to guess its type (generally wrongly, in the case of text fields). You can use the validate schema management strategy to be really sure that, on bootstrap, the Elasticsearch mapping is in sync with Hibernate Search.


    Older answer (Hibernate Search 5):

    In development mode, try this:

    spring: 
      jpa:
        hibernate:
          ddl-auto: create-drop
        properties:
          hibernate:
            search:
              default:
                indexmanager: elasticsearch
                elasticsearch:
                  host: 127.0.0.1:9200
                  required_index_status: yellow
                  index_schema_management_strategy: drop-and-create-and-drop
    

    See https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#elasticsearch-schema-management-strategy
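
To check the mapping at startup rather than recreate it, the validate strategy mentioned above could be configured roughly as follows. This is a minimal sketch for Hibernate Search 6, assuming the same Spring Boot property layout as the snippets above; it only changes the strategy value and leaves everything else untouched.

```yaml
# Sketch only: fail at startup if the Elasticsearch mapping is out of sync
# with the Hibernate Search mapping, instead of dropping and recreating it.
# Assumes the Hibernate Search 6 property name used earlier in this answer.
spring:
  jpa:
    properties:
      hibernate.search:
        schema_management.strategy: validate
```

With Hibernate Search 5, the equivalent should be to set index_schema_management_strategy to validate in the older configuration shown above, in place of drop-and-create-and-drop. Since validation never modifies the index, a stale mapping still has to be fixed separately, for example by running once with one of the drop-and-create strategies.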