elasticsearchelasticsearch-queryelasticsearch-nested

How to query keyword nested field that can be an integer values in ElasticSearch?


This is example of field in ES document. How can i query 'prices.value' that can be only integers?

In this case, value is "150.99" and can fully be converted to integer. But sometimes in can be a text something like "value": "a lot" and i want to exclude document with this values.

 "prices": [
     {
      "currency": "RUR",
      "id_offer": 605994811,
       "id_prcdoc": 42172,
        "id_prcknd": 20859,
         "info": {},
         "min_order": null,
         "sell_by": null,
         "value": "150.99"}]

Index of this field:

"prices": {
        "type": "nested",
        "properties": {
          "currency": {
            "type": "keyword"
          },
          "id_offer": {
            "type": "integer"
          },
          "id_prcdoc": {
            "type": "integer"
          },
          "id_prcknd": {
            "type": "integer"
          },
          "min_order": {
            "type": "keyword"
          },
          "sell_by": {
            "type": "keyword"
          },
          "value": {
            "type": "keyword",
            "index": false
          }
        }
      }

And sometimes it 'value' field can be '0.00' and probably i want to exclude this values to..


Solution

  • You can use painless script to check if value can be converted to number.

    {
      "query": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "prices",
                "query": {
                  "script": {
                    "script": "if(doc['prices.value'].size()==0){return false;}if(doc['prices.value'].value=='0.00'){return false;}try{ Double.parseDouble(doc['prices.value'].value); return true;} catch(Exception e){return false;} "
                  }
                },
                "inner_hits": {}
              }
            }
          ]
        }
      }
    }
    

    Result will be in inner_hit. Since scripts are slow it is better to resolve it while indexing . Another field can we created which will have value only if it is price value is number, and in query this field can be used

    EDIT:

    {
      "query": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "prices",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "regexp": {
                          "prices.value": "[1-9][0-9]*.*[0-9]*"
                        }
                      }
                    ]
                  }
                },
                "inner_hits": {}
              }
            }
          ]
        }
      }
    }