elasticsearch spring-data-elasticsearch

Query results obtained using ElasticsearchOperations differ from those obtained using curl


I am trying to query some indices using spring-data-elasticsearch's ElasticsearchOperations with the following query:

        val index = IndexCoordinates.of("test-*-en")

        val searchQuery = NativeSearchQueryBuilder()
            .withFilter(
                QueryBuilders.boolQuery()
                    .must(QueryBuilders.matchQuery("english","test query : 12345678"))
            ).withMaxResults(30)
            .build()

        return elasticsearchOperations.search(searchQuery, TestEsDocument::class.java, index)

However, the results differ from those I get via curl, even though I am using the exact same query. I suspect the difference arises because documents with a higher score are not prioritized when the search goes through ElasticsearchOperations.

I've included the curl query for reference:

POST test-*-en/_search?size=30
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "english": {
              "query": "test query : 12345678",
              "operator": "OR",
              "prefix_length": 0,
              "max_expansions": 50,
              "fuzzy_transpositions": true,
              "lenient": false,
              "zero_terms_query": "NONE",
              "auto_generate_synonyms_phrase_query": true,
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}

Though I am not certain, I suspect that ElasticsearchOperations is not reading the scores from Elasticsearch correctly, as shown in the attached screenshot of the SearchResult returned by ElasticsearchOperations.

Unfortunately, I am constrained to Elasticsearch version 7.10.1-oss and cannot upgrade, so I am using spring-data-elasticsearch:4.2.12 and setting elasticsearch.version to 7.10.1 with the following option:

extra["elasticsearch.version"] = "7.10.1"

Could you advise whether this issue arises from the version mismatch between spring-data-elasticsearch and my Elasticsearch cluster, and whether it can be resolved? Alternatively, should I send the query through a custom-built WebClient POST request instead?

Many thanks in advance!


Solution

  • Spring Data Elasticsearch returns 30 of the top-scored results. The max score is 1.0, all 30 returned hits have that score, and you did not specify a sort order. Do the results returned by curl have the same score?

    Edit:

    I just saw that your code adds the query as a filter (withFilter). With no query set, the search first matches all documents without any relevance scoring and then passes them through the filter, which is why every hit has a score of 1.0. Add the query as a query (withQuery), not as a filter.
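
As a minimal sketch of that change, the builder call from the question can be rewritten to use withQuery in place of withFilter. The function name searchByScore and its parameters are assumptions made for illustration; the index pattern, field name and query text are taken from the question, and the calls are the ones the question already uses with spring-data-elasticsearch 4.2.x.

```kotlin
import org.elasticsearch.index.query.QueryBuilders
import org.springframework.data.elasticsearch.core.ElasticsearchOperations
import org.springframework.data.elasticsearch.core.SearchHits
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder

// Hypothetical helper: the name and signature are illustrative, not part of the question.
fun <T : Any> searchByScore(
    operations: ElasticsearchOperations,
    documentType: Class<T>
): SearchHits<T> {
    val index = IndexCoordinates.of("test-*-en")

    val searchQuery = NativeSearchQueryBuilder()
        // withQuery puts the bool/match clause in the scoring part of the request,
        // so hits are ranked by relevance instead of all scoring 1.0.
        .withQuery(
            QueryBuilders.boolQuery()
                .must(QueryBuilders.matchQuery("english", "test query : 12345678"))
        )
        .withMaxResults(30)
        .build()

    return operations.search(searchQuery, documentType, index)
}
```

Calling searchByScore(elasticsearchOperations, TestEsDocument::class.java) and comparing the score of each returned SearchHit with the _score values from the curl response should show whether the two result sets now agree.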