elasticsearch, elasticsearch-rest-client

Elasticsearch query match multiple values to single field


I am trying to get documents whose field ABC matches either X1 or Y1. I tried both must and should queries, but neither returns the expected result. What kind of query should I use? I am using the Java High Level REST Client (RestHighLevelClient).

{
  "bool" : {
    "must" : [
      {
        "term" : {
          "ABC" : {
            "value" : "X1",
            "boost" : 1.0
          }
        }
      },
      {
        "term" : {
          "ABC" : {
            "value" : "Y1",
            "boost" : 1.0
          }
        }
      }
    ]
  }
}

OR

{
  "bool" : {
    "should" : [
      {
        "term" : {
          "ABC" : {
            "value" : "X1",
            "boost" : 1.0
          }
        }
      },
      {
        "term" : {
          "ABC" : {
            "value" : "Y1",
            "boost" : 1.0
          }
        }
      }
    ]
  }
}

Mapping


{
  "mappings": {
    "properties": {
      "ABC": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 2
          }
        }
      },


A mustNot condition works fine: if I reverse the condition and exclude the unwanted field values instead, I get the expected result.

X1 and Y1 are exact field values (Think enums)

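import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

BoolQueryBuilder x = QueryBuilders.boolQuery();
// Add one optional (should) term clause per enum value
for (SomeEnum value : enums) {
    x.should(QueryBuilders.termQuery("ABC", value.name()));
}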

The query still returns all documents. It should have filtered the documents down to those with matching values.

Sample Doc


{
  "_index": "some_index",
  "_type": "_doc",
  "_id": "uyeuyeuryweoyqweo",
  "_score": 1.0,
  "_source": {
    "A": true,
    "ABC": "X1",
    "WS": "E"
  }
},
{
  "_index" : "some_index",
  "_type" : "_doc",
  "_id" : "uyeuyeuryweoyqweo1",
  "_score" : 1.0,
  "_source" : {
    "A" : true,
    "ABC" : "Y1",
    "WS" : "MMM"
  }
}



Solution

  • You have not provided your mapping, but a likely cause is a mismatch between the search-time term and the tokens stored in the index.

    You are using a term query, which is not analyzed. As the documentation says:

    Returns documents that contain an exact term in a provided field.

    This means the index must contain the exact tokens X1 and Y1. If ABC is a text field and you have not defined an analyzer, Elasticsearch uses the standard analyzer, which lowercases tokens. The index therefore stores x1 and y1, and a term query for X1 or Y1 matches nothing.

    EDIT: As suspected, the issue was the term query being run against a text field. Querying the ABC.keyword sub-field instead gives the expected results:

    {
      "bool" : {
        "should" : [
          {
            "term" : {
              "ABC.keyword" : {
                "value" : "X1",
                "boost" : 1.0
              }
            }
          },
          {
            "term" : {
              "ABC.keyword" : {
                "value" : "Y1",
                "boost" : 1.0
              }
            }
          }
        ]
      }
    }
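
To make the token mismatch concrete, here is a minimal Java sketch that only simulates the behaviour described above. It does not call Elasticsearch. The class TermMatchSketch and its methods are hypothetical names, and analyzeAsText is a rough approximation of the standard analyzer (split on non-alphanumeric characters, then lowercase), not the real implementation. The ignoreAbove parameter mirrors the ignore_above setting from the mapping in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for Elasticsearch analysis, for illustration only.
class TermMatchSketch {

    // Approximates a text field with the standard analyzer:
    // split on non-alphanumeric characters and lowercase each token.
    static List<String> analyzeAsText(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Approximates a keyword field: the whole value is one unmodified token,
    // and values longer than ignoreAbove characters are not indexed at all.
    static List<String> analyzeAsKeyword(String value, int ignoreAbove) {
        return value.length() > ignoreAbove ? List.of() : List.of(value);
    }

    // A term query compares the search value to the indexed tokens as-is,
    // without analyzing the search value first.
    static boolean termMatches(List<String> indexedTokens, String searchValue) {
        return indexedTokens.contains(searchValue);
    }

    public static void main(String[] args) {
        String stored = "X1";
        // Text field: the index holds "x1", so the term "X1" does not match.
        System.out.println("ABC         -> " + termMatches(analyzeAsText(stored), "X1"));
        // Keyword sub-field: the index holds "X1", so the term "X1" matches.
        System.out.println("ABC.keyword -> " + termMatches(analyzeAsKeyword(stored, 2), "X1"));
    }
}
```

In the Java snippet from the question, the corresponding change would be to pass "ABC.keyword" instead of "ABC" as the field name to QueryBuilders.termQuery. Note also that with "ignore_above": 2, values longer than two characters are not indexed in ABC.keyword, so longer enum names would not be found by this query.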