database elasticsearch elasticsearch-5 elasticsearch-nested

Multiple match_phrase conditions with another bool in a single ElasticSearch query?


I am trying to write an Elasticsearch query that searches a text field ("body") and returns items that match at least one of two multi-word phrases I provide (e.g. "stack overflow" OR "the stackoverflow"). I would also like the query to return only results that occur after a given timestamp, ordered by time.

My current solution is below. I believe the MUST is working correctly (gte a timestamp), but the BOOL + SHOULD with two match_phrases is not correct. I am getting the following error:

Unexpected character ('{' (code 123)): was expecting double-quote to start field name

Which I think is because I have two match_phrases in there?

This is the ES mapping, and the details of the ES API I am using are here.

{"query":
  {"bool":
    {"should":
      [{"match_phrase":
         {"body":"a+phrase"}
       },
       {"match_phrase":
         {"body":"another+phrase"}
       }
      ]
    },
  {"bool":
    {"must":
      [{"range":
        {"created_at:
          {"gte":"thispage"}
        }
       }
      ]}
     }
    },"size":10000,
      "sort":"created_at"
}

Solution

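  • You are missing a closing " after created_at. In addition, the second bool object sits directly inside the outer bool, where JSON expects a field name rather than a {, which is what the "was expecting double-quote to start field name" error points to. Nesting the should clauses in their own bool inside must fixes both: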

    {
        "query": {
            "bool": {
                "must": [
                    {
                        "range": {
                            "created_at": {
                                "gte": "1534004694"
                            }
                        }
                    },
                    {
                        "bool": {
                            "should": [
                                {
                                    "match_phrase": {
                                        "body": "a+phrase"
                                    }
                                },
                                {
                                    "match_phrase": {
                                        "body": "another+phrase"
                                    }
                                }
                            ]
                        }
                    }
                ]
            }
        },
        "size": 10,
        "sort": "created_at"
    }
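
If the query is assembled in code rather than typed by hand, quoting and nesting mistakes like these cannot occur. The following is a minimal sketch in Python that builds the same nested structure; the function name `build_phrase_query` and its parameters are hypothetical, and the field names "body" and "created_at" are taken from the question.

```python
import json


def build_phrase_query(phrases, since, field="body", time_field="created_at", size=10):
    """Build a query body: any of `phrases` in `field`, at or after `since`.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    # One match_phrase clause per phrase; they are OR-ed by the inner bool/should.
    should = [{"match_phrase": {field: phrase}} for phrase in phrases]
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {time_field: {"gte": since}}},
                    {"bool": {"should": should}},
                ]
            }
        },
        "size": size,
        "sort": time_field,
    }


query = build_phrase_query(["a+phrase", "another+phrase"], "1534004694")
# json.dumps always emits well-formed JSON, so the body can be sent as-is.
print(json.dumps(query, indent=4))
```

Because the inner bool contains only should clauses, at least one phrase has to match for a document to be returned.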
    

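    Also, a bool object may have both must and should as properties, so this flatter form is worth trying. Note that when must is present, should clauses are optional by default and only affect scoring, so set "minimum_should_match": 1 to require at least one of the phrases.

    {
        "query": {
            "bool": {
                "must": {
                    "range": {
                        "created_at": {
                            "gte": "1534004694"
                        }
                    }
                },
                "should": [
                    {
                        "match_phrase": {
                            "body": "a+phrase"
                        }
                    },
                    {
                        "match_phrase": {
                            "body": "another+phrase"
                        }
                    }
                ],
                "minimum_should_match": 1
            }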
        },
        "size": 10,
        "sort": "created_at"
    }
    

    On a side note, Postman or any JSON formatter/validator would really help in determining where the error is.
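
As an alternative to Postman, a few lines of Python can act as the validator. This is a rough sketch using only the standard `json` module; the helper name `find_json_error` is hypothetical, and the sample string is a shortened stand-in for the query in the question, not the original.

```python
import json


def find_json_error(text):
    """Return None if `text` parses as JSON, else (line, column, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions of the offending character.
        return (err.lineno, err.colno, err.msg)
    return None


# Shortened stand-in: a second object appears where a field name is expected.
broken = '{"bool": {"should": []},\n {"bool": {"must": []}}}'
print(find_json_error(broken))

# A well-formed body returns None.
print(find_json_error('{"bool": {"must": [], "should": []}}'))
```

The reported line and column point at the first character the parser could not accept, which is usually at or just after the actual mistake.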