elasticsearch search keyword-search

Elasticsearch 7 number_format_exception for input value as a String


I have a field in my index with the following mapping:

"sequence_number" : {
  "type" : "long",
  "copy_to" : [
    "_custom_all"
  ]
}

and I am using the following search query:

POST /my_index/_search
{
  "query": {
    "term": {
      "sequence_number": {
        "value": "we"
      }
    }
  }
}

I am getting this error message:

,"index_uuid":"FTAW8qoYTPeTj-cbC5iTRw","index":"my_index","caused_by":{"type":"number_format_exception","reason":"For input string: \"we\""}}}]},"status":400}
        at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:260) ~[elasticsearch-rest-client-7.1.1.jar:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:238) ~[elasticsearch-rest-client-7.1.1.jar:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.1.1.jar:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1433) ~[elasticsearch-rest-high-level-client-7.1.1.jar:7.1.1]
        at

How can I ignore number_format_exception errors, so that the query either returns nothing or ignores this particular filter? Either is acceptable.

Thanks in advance.


Solution

  • What you are looking for is not possible. Ideally, you should have coerce enabled on your numeric fields so that your index doesn't contain dirty data.

    The best solution is to handle this in the application that generates the Elasticsearch query: when searching a numeric field, check for NumberFormatException while parsing the input, and reject the query if the exception is thrown. Since your index doesn't contain dirty data in the first place, a non-numeric value could never match anyway.

    Edit: Another interesting approach is to validate the query before running it, using the Validate API as suggested by @prakash. The only drawback is that it adds another network call, but if your application is not latency-sensitive, it can be used as a workaround.
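
The application-side check described above could look roughly like the sketch below. The class name NumericTermFilter and both method names are hypothetical, made up for illustration; the sketch builds the request body as a plain JSON string so it runs without any Elasticsearch dependency. It assumes the field name is a trusted constant (it is not escaped), and it falls back to a match_none query so that non-numeric input simply returns no hits.

```java
import java.util.OptionalLong;

// Hypothetical helper: validates input for a "long" field before a query is built.
public class NumericTermFilter {

    // Try to parse the raw input as a long; an empty result means "not a valid number".
    public static OptionalLong parseLongOrEmpty(String raw) {
        if (raw == null) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(raw.trim()));
        } catch (NumberFormatException e) {
            // Same failure Elasticsearch would report, caught before any request is sent.
            return OptionalLong.empty();
        }
    }

    // Build the search body: a term query for valid input, match_none otherwise.
    // Assumption: `field` is a trusted constant, so it is not JSON-escaped here.
    public static String buildSearchBody(String field, String raw) {
        OptionalLong value = parseLongOrEmpty(raw);
        if (value.isPresent()) {
            return "{\"query\":{\"term\":{\"" + field + "\":{\"value\":"
                    + value.getAsLong() + "}}}}";
        }
        return "{\"query\":{\"match_none\":{}}}";
    }

    public static void main(String[] args) {
        System.out.println(buildSearchBody("sequence_number", "42"));
        System.out.println(buildSearchBody("sequence_number", "we"));
    }
}
```

If the filter is one clause among several, the same check can instead be used to leave that clause out of the query, which corresponds to the "ignore this filter" behaviour asked about in the question.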