
Elasticsearch should clauses give different scores


I am retrieving documents by filtering and using a bool query to apply a score. For example:

{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "color": "Yellow"
          }
        },
        {
          "term": {
            "color": "Red"
          }
        },
        {
          "term": {
            "color": "Blue"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}

If a document contains only "Yellow", it gets a score of 1.5, but if a document contains only "Red", it gets a score of 1.4. I want the scores to be the same. Each document matches exactly one term, so why are the scores different? Is there a way to ignore the order of the terms in the should query? When there is only one match, the "Yellow" document always gets the higher score.

UPDATE: The issue is not the order of the terms in the should array but the number of documents containing each term.


Solution

  • If scoring is not important to you, you can wrap the bool/should clause in a filter clause.

    Filter context skips scoring and treats the query as a plain yes/no match, so every matched document gets a score of 0.0.

    {
      "query": {
        "bool": {
          "filter": {
            "bool": {
              "should": [
                {
                  "term": {
                    "color.keyword": "Yellow"
                  }
                },
                {
                  "term": {
                    "color.keyword": "Black"
                  }
                },
                {
                  "term": {
                    "color.keyword": "Purple"
                  }
                }
              ],
              "minimum_should_match": 1
            }
          }
        }
      }
    } 
    

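    Without the filter, the score of a matched document depends on several factors, such as the length of the field, the frequency of the term in the field, the number of documents containing the term, and the total number of documents.

    You can learn more about how the score is calculated by setting explain=true on the search request: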

    GET /_search?explain=true
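
To see why two single-term matches can score differently, here is a rough Python sketch of a textbook BM25 score using a Lucene-style IDF term. It is only an illustration: the function names, the document counts (100 documents, 10 containing "Yellow", 12 containing "Red") and the default k1 and b values are assumptions, and the exact constants Elasticsearch applies can differ between versions, so treat the explain output as the source of truth.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """Lucene-style BM25 IDF: rarer terms (small doc_freq) get a larger value."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_score(doc_count: int, doc_freq: int, term_freq: int = 1,
               field_len: float = 1.0, avg_field_len: float = 1.0,
               k1: float = 1.2, b: float = 0.75) -> float:
    """Textbook BM25 score for one term in one document (illustrative only)."""
    idf = bm25_idf(doc_count, doc_freq)
    # Length normalisation: longer-than-average fields reduce the tf component.
    norm = k1 * (1 - b + b * field_len / avg_field_len)
    tf = term_freq * (k1 + 1) / (term_freq + norm)
    return idf * tf


# Hypothetical index statistics, not taken from the question.
total_docs = 100
yellow = bm25_score(total_docs, doc_freq=10)  # "Yellow" appears in 10 documents
red = bm25_score(total_docs, doc_freq=12)     # "Red" appears in 12 documents

print(f"Yellow: {yellow:.3f}  Red: {red:.3f}")
```

Under these assumed counts, "Yellow" is the rarer term and so gets the higher score, even though each document matches exactly one should clause and the order of the clauses plays no part.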