elasticsearch bool query combine must with OR


I am currently trying to migrate a Solr-based application to Elasticsearch.

I have this Lucene query:

(( 
    name:(+foo +bar) 
    OR info:(+foo +bar) 
)) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)

As far as I understand, this is a set of must clauses combined with a boolean OR:

Get all documents containing (foo AND bar in name) OR (foo AND bar in info). Then filter the results by the condition state=1 and boost documents that have an image.

I have been trying to use a bool query with must, but I cannot work out how to express a boolean OR between groups of must clauses. Here is what I have:

GET /test/object/_search
{
  "from": 0,
  "size": 20,
  "sort": {
    "_score": "desc"
  },
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "foo"
          }
        },
        {
          "match": {
            "name": "bar"
          }
        }
      ],
      "must_not": [],
      "should": [
        {
          "match": {
            "has_image": {
              "query": 1,
              "boost": 100
            }
          }
        }
      ]
    }
  }
}

As you can see, the must conditions for info are missing.

** UPDATE **

I have updated my elasticsearch query and got rid of that function score. My base problem still exists.


Solution

  • I finally managed to create a query that does exactly what I wanted:

    A filtered nested boolean query. I am not sure why this is not documented. Maybe someone here can tell me?

    Here is the query:

    GET /test/object/_search
    {
      "from": 0,
      "size": 20,
      "sort": {
        "_score": "desc"
      },
      "query": {
        "filtered": {
          "filter": {
            "bool": {
              "must": [
                {
                  "term": {
                    "state": 1
                  }
                }
              ]
            }
          },
          "query": {
            "bool": {
              "should": [
                {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "name": "foo"
                        }
                      },
                      {
                        "match": {
                          "name": "bar"
                        }
                      }
                    ],
                    "should": [
                      {
                        "match": {
                          "has_image": {
                            "query": 1,
                            "boost": 100
                          }
                        }
                      }
                    ]
                  }
                },
                {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "info": "foo"
                        }
                      },
                      {
                        "match": {
                          "info": "bar"
                        }
                      }
                    ],
                    "should": [
                      {
                        "match": {
                          "has_image": {
                            "query": 1,
                            "boost": 100
                          }
                        }
                      }
                    ]
                  }
                }
              ],
              "minimum_should_match": 1
            }
          }    
        }
      }
    }
    

    In pseudo-SQL:

    SELECT * FROM /test/object
    WHERE 
        ((name=foo AND name=bar) OR (info=foo AND info=bar))
    AND state=1
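
    The AND conditions inside each branch can presumably be collapsed as well. A match query analyzes its input into terms, and "operator": "and" requires every one of those terms to be present. A minimal sketch of just the OR part, shown as the request body only; it is an alternative formulation, not the query above, and it omits the state filter and the boost:

    ```json
    {
      "query": {
        "bool": {
          "should": [
            { "match": { "name": { "query": "foo bar", "operator": "and" } } },
            { "match": { "info": { "query": "foo bar", "operator": "and" } } }
          ],
          "minimum_should_match": 1
        }
      }
    }
    ```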
    

    Please keep in mind that how name=foo is handled internally depends on your field analysis and mappings. The behavior can range from fuzzy to strict matching.

    "minimum_should_match": 1 says that at least one of the should clauses must match.

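    To illustrate where this setting matters (a minimal sketch reusing the fields above, not part of the original query): as far as I know, should clauses become optional as soon as the same bool also contains a must clause, so the setting has to be given explicitly if one of them is still required. Request body only:

    ```json
    {
      "query": {
        "bool": {
          "must": [
            { "term": { "state": 1 } }
          ],
          "should": [
            { "match": { "name": "foo" } },
            { "match": { "info": "foo" } }
          ],
          "minimum_should_match": 1
        }
      }
    }
    ```

    Without "minimum_should_match": 1, this sketch would also return documents with state 1 that match neither name nor info; the should clauses would then only influence scoring.
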
    The following clause means that whenever a document in the result set matches has_image:1, the score of that clause is multiplied by 100. This changes the result ordering.

    "should": [
      {
        "match": {
          "has_image": {
            "query": 1,
            "boost": 100
          }
        }
      }
    ]
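
    Two things could be tidied up, sketched here under assumptions. Since the boost clause is identical in both branches, it can be moved once to an outer bool. And on Elasticsearch 5.0 and later, where the filtered query was removed, the state condition goes into the filter clause of a bool query instead. Request body only, since the endpoint depends on your version. Scores may differ somewhat from the query above, because the boost is applied once instead of inside each branch:

    ```json
    {
      "from": 0,
      "size": 20,
      "query": {
        "bool": {
          "filter": [
            { "term": { "state": 1 } }
          ],
          "must": [
            {
              "bool": {
                "should": [
                  {
                    "bool": {
                      "must": [
                        { "match": { "name": "foo" } },
                        { "match": { "name": "bar" } }
                      ]
                    }
                  },
                  {
                    "bool": {
                      "must": [
                        { "match": { "info": "foo" } },
                        { "match": { "info": "bar" } }
                      ]
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          ],
          "should": [
            { "match": { "has_image": { "query": 1, "boost": 100 } } }
          ]
        }
      }
    }
    ```

    The outer should stays optional here because the same bool has must and filter clauses, so has_image only affects ranking and never excludes a document.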
    

    Have fun guys :)