
Query Elasticsearch where a key's value is at least some number


I am processing files to detect which labels they contain and the confidence with which each label was recognized.

I created a nested mapping called tags which contains label (text) and confidence (float between 0 and 100).

Here is an example of how I think the query would work (I know it's invalid). It should be something like "Find documents that have tags labelled A and B, where A has a confidence of at least 37 and B has a confidence of at least 80".

{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "tags.label": "A"
              },
              "range": {
                "tags.confidence": {
                  "gte": 37
                }
              }
            },
            {
              "match": {
                "tags.label": "B"
              },
              "range": {
                "tags.confidence": {
                  "gte": 80
                }
              }
            }
          ]
        }
      }
    }
  }
}

Any ideas? I am pretty sure I need to approach it differently (maybe with a different mapping), but I am not sure how to accomplish this in Elasticsearch. Is this possible?


Solution

  • Let's say your parent document contains two nested documents, something like this:

    {  
       "tags":[  
          {  
             "label":"A",
             "confidence":40
          },
          {  
             "label":"B",
             "confidence":85
          }
       ]
    }
    

    In that case, your query would look like this:

    Nested Query:

    POST <your_index_name>/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "tags",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "tags.label": "A"
                        }
                      },
                      {
                        "range": {
                          "tags.confidence": {
                            "gte": 37
                          }
                        }
                      }
                    ]
                  }
                }
              }
            },
            {
              "nested": {
                "path": "tags",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "tags.label": "B"
                        }
                      },
                      {
                        "range": {
                          "tags.confidence": {
                            "gte": 80
                          }
                        }
                      }
                    ]
                  }
                }
              }
            }
          ]
        }
      }
    }
    

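    Note that each nested object is indexed as a separate hidden document, and a nested query evaluates its conditions against one nested document at a time. That is why you need two separate nested queries, one per label. With a single nested query (as in your attempt), Elasticsearch would look for all four conditions (label A, confidence of at least 37, label B, confidence of at least 80) inside one single nested document of the parent, and no tag in the example above satisfies all of them.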

    Hope this helps!
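
The same pattern extends to any number of label/threshold pairs: one nested clause per label, all combined under a top-level bool/must. Below is a minimal Python sketch that only builds the request body as a plain dictionary. The helper names nested_tag_clause and build_tags_query are hypothetical (they are not part of any Elasticsearch client), and the field names tags.label and tags.confidence are taken from the mapping described in the question.

```python
import json


def nested_tag_clause(label, min_confidence, path="tags"):
    """Build one nested clause (hypothetical helper).

    Both conditions sit inside the same nested query, so a single tag
    must match the label AND meet the confidence threshold.
    """
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"match": {f"{path}.label": label}},
                        {"range": {f"{path}.confidence": {"gte": min_confidence}}},
                    ]
                }
            },
        }
    }


def build_tags_query(thresholds, path="tags"):
    """Combine one nested clause per label under a top-level bool/must."""
    return {
        "query": {
            "bool": {
                "must": [
                    nested_tag_clause(label, minimum, path)
                    for label, minimum in thresholds.items()
                ]
            }
        }
    }


if __name__ == "__main__":
    # Reproduces the request body shown above for labels A and B.
    body = build_tags_query({"A": 37, "B": 80})
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST <your_index_name>/_search. Keeping each label in its own nested clause is the important design choice here; merging the clauses into one nested query would bring back the original problem.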