elasticsearch, elasticsearch-analyzers

Elasticsearch not using synonyms analyzer on query_string query


I'm trying to create an Elasticsearch index that uses a custom synonym analyzer. However, it isn't working as I expect: my query returns no results.

I create an index as follows:

{
    "settings": {
        "analysis": {
            "filter": {
                "english_keywords": {
                    "type": "keyword_marker",
                    "keywords": [
                        "microsoft"
                    ] // stem_exclusion
                },
                "synonym": {
                    "type": "synonym",
                    "synonyms": [
                        "carcinoma => cancer"
                    ]
                }
            },
            "analyzer": {
                "custom_english_analyzer": {
                    "tokenizer": "standard",
                    "filter": [
                        "apostrophe",
                        "asciifolding",
                        "lowercase",
                        "synonym",
                        "stop",
                        "english_keywords",
                        "kstem"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "my_id": {
                "type": "keyword"
            },
            "titles": {
                "type": "keyword",
                "fields": {
                    "fulltext": {
                        "type": "text",
                        "analyzer": "custom_english_analyzer"
                    }
                }
            }
        }
    }
}

Then I index a record:

{
  "my_id": "HF124",
  "titles": [
    "skin cancer"
  ]
}

If I run a test on the analyzer using /test_index/_analyze with the body

{
  "analyzer" : "custom_english_analyzer",
  "text" : "carcinoma"
}

I get back the following:

{
    "tokens": [
        {
            "token": "cancer",
            "start_offset": 0,
            "end_offset": 9,
            "type": "SYNONYM",
            "position": 0
        }
    ]
}

However, when I run the following query, I don't get any records returned. I would expect 'carcinoma' to be replaced with 'cancer' and match. What did I miss?

{
    "explain": true,
    "track_scores": true,
    "track_total_hits": true,
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "query": "carcinoma*",
                        "fields": [
                            "titles.fulltext"
                        ]
                    }
                }
            ]
        }
    }
}

Solution

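  • You need to remove the wildcard (*) from the query. Replace carcinoma* with carcinoma. By default, query_string does not run wildcard terms through the analyzer's token filters, so the synonym filter never rewrites carcinoma* to cancer, and no indexed term in titles.fulltext starts with carcinoma.

    Modify your search query as follows: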

    {
      "explain": true,
      "track_scores": true,
      "track_total_hits": true,
      "query": {
        "bool": {
          "must": [
            {
              "query_string": {
                "query": "carcinoma",       // note this
                "fields": [
                  "titles.fulltext"
                ]
              }
            }
          ]
        }
      }
    }
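
    To check how each form of the query is actually parsed, one option is the validate API. This is a minimal sketch that assumes the index is named test_index, as in the _analyze call in the question. Send the body below to /test_index/_validate/query?explain=true, first with carcinoma* and then with carcinoma, and compare the explanation field in the two responses. The wildcard form should still show the literal carcinoma prefix, while the plain term should show the synonym-rewritten term.

    ```json
    {
      "query": {
        "query_string": {
          "query": "carcinoma*",
          "fields": [
            "titles.fulltext"
          ]
        }
      }
    }
    ```

    The exact wording of the explanation can vary between Elasticsearch versions, so treat it as a diagnostic aid only.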