
Elasticsearch: phrase match on the whole document, not just one specific field?


Is there a way to use Elasticsearch's match_phrase on an entire document, not just one specific field?

We want the user to be able to enter a search term in quotes and get a phrase match anywhere in the document.

{
    "size": 20,
    "from": 0,
    "query": {
        "match_phrase": {
            "my_column_name": "I want to search for this exact phrase"
        }
    }
}

So far I have only found phrase matching against specific fields, which means I must name every field to match within.

Our document has hundreds of fields, so I don't think it's feasible to manually list the 600+ fields in every match_phrase query. The resulting JSON would be huge.


Solution

  • You can use a multi_match query with type phrase, which runs a match_phrase query on each field and uses the _score from the best field. See the phrase and phrase_prefix section of the multi_match documentation.

    If no fields are provided, the multi_match query defaults to the index.query.default_field index setting, which in turn defaults to *. This extracts all fields in the mapping that are eligible for term queries and filters out the metadata fields. All extracted fields are then combined to build a query.

    Here is a working example with index data, a search query, and the search result.

    Index data:

    {
        "name":"John",
        "cost":55,
        "title":"Will Smith"
    }
    {
        "name":"Will Smith",
        "cost":55,
        "title":"book"
    }
    

    Search Query:

    {
      "query": {
        "multi_match": {
          "query": "Will Smith",
          "type": "phrase"
        }
      }
    }
    

    Search Result:

    "hits": [
      {
        "_index": "64519840",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.2199391,
        "_source": {
          "name": "Will Smith",
          "cost": 55,
          "title": "book"
        }
      },
      {
        "_index": "64519840",
        "_type": "_doc",
        "_id": "2",
        "_score": 1.2199391,
        "_source": {
          "name": "John",
          "cost": 55,
          "title": "Will Smith"
        }
      }
    ]
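
Since the question mentions users typing a search term in quotes, the request body above could be generated from raw input along these lines. This is only a minimal sketch: the function name build_search_body and the fallback to a plain multi_match for unquoted input are assumptions for illustration, not part of the original answer, and sending the body to Elasticsearch is left out.

```python
import json


def build_search_body(user_input: str, size: int = 20, offset: int = 0) -> dict:
    """Build an Elasticsearch request body from raw user input.

    Hypothetical helper: input wrapped in double quotes becomes a phrase
    search; anything else becomes a regular multi_match (an assumption
    about the desired behaviour for unquoted input).
    """
    text = user_input.strip()
    is_phrase = len(text) >= 2 and text.startswith('"') and text.endswith('"')

    multi_match = {"query": text[1:-1] if is_phrase else text}
    if is_phrase:
        # "type": "phrase" runs match_phrase on each field and keeps the best score.
        multi_match["type"] = "phrase"

    # No "fields" key is set, so the query falls back to
    # index.query.default_field (which defaults to *).
    return {"size": size, "from": offset, "query": {"multi_match": multi_match}}


if __name__ == "__main__":
    print(json.dumps(build_search_body('"Will Smith"'), indent=2))
```

Calling build_search_body('"Will Smith"') yields the same multi_match phrase query as in the answer, plus the size and from values used in the question, without listing any of the 600+ fields.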