I'm trying to create an Elasticsearch index that uses a custom synonym analyzer. However, it isn't working as I expect: my query returns no results.
I create an index as follows:
{
"settings": {
"analysis": {
"filter": {
"english_keywords": {
"type": "keyword_marker",
"keywords": [
"microsoft"
] // stem_exclusion
},
"synonym": {
"type": "synonym",
"synonyms": [
"carcinoma => cancer"
]
}
},
"analyzer": {
"custom_english_analyzer": {
"tokenizer": "standard",
"filter": [
"apostrophe",
"asciifolding",
"lowercase",
"synonym",
"stop",
"english_keywords",
"kstem"
]
}
}
}
},
"mappings": {
"properties": {
"my_id": {
"type": "keyword"
},
"titles": {
"type": "keyword",
"fields": {
"fulltext": {
"type": "text",
"analyzer": "custom_english_analyzer"
}
}
}
}
}
}
Then I index a record:
{
"my_id": "HF124",
"titles": [
"skin cancer"
]
}
If I run a test on the analyzer using /test_index/_analyze with the following body:
{
"analyzer" : "custom_english_analyzer",
"text" : "carcinoma"
}
I get back the following:
{
"tokens": [
{
"token": "cancer",
"start_offset": 0,
"end_offset": 9,
"type": "SYNONYM",
"position": 0
}
]
}
However, when I run the following query, I don't get any records returned. I would expect 'carcinoma' to be replaced with 'cancer' and match. What did I miss?
{
"explain": true,
"track_scores": true,
"track_total_hits": true,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "carcinoma*",
"fields": [
"titles.fulltext"
]
}
}
]
}
}
}
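You need to remove the wildcard (*) from the query: replace carcinoma* with carcinoma. Wildcard terms in a query_string query are not passed through the full analyzer, so the synonym filter never rewrites carcinoma to cancer. The pattern carcinoma* is then compared against the indexed terms (skin, cancer), and none of them match.
Modify your search query as follows: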
{
"explain": true,
"track_scores": true,
"track_total_hits": true,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "carcinoma", // note this
"fields": [
"titles.fulltext"
]
}
}
]
}
}
}
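To make the difference concrete, here is a toy Python model of the two code paths. It is only a sketch and not Elasticsearch code: the `analyze` and `search` functions are hypothetical names invented for illustration, and the analyzer is reduced to lowercasing, whitespace splitting, and the single synonym rule from the question (the stop and kstem filters are left out).

```python
# Toy model only: this is not Elasticsearch code.
SYNONYMS = {"carcinoma": "cancer"}  # mirrors the rule "carcinoma => cancer"


def analyze(text):
    """Lowercase, split on whitespace, then apply the synonym map."""
    return [SYNONYMS.get(tok, tok) for tok in text.lower().split()]


def search(query, indexed_terms):
    """Return the indexed terms that a query_string-like clause would match."""
    if query.endswith("*"):
        # Wildcard/prefix clause: the prefix is compared with the indexed
        # terms directly, so the synonym map is never consulted.
        prefix = query[:-1].lower()
        return sorted(t for t in indexed_terms if t.startswith(prefix))
    # Plain clause: the query text goes through the analyzer first.
    return sorted(set(analyze(query)) & indexed_terms)


indexed = set(analyze("skin cancer"))  # terms stored for titles.fulltext

print(search("carcinoma*", indexed))  # wildcard: no synonym expansion
print(search("carcinoma", indexed))   # analyzed: rewritten to "cancer"
```

In this model the wildcard query finds nothing because no indexed term starts with "carcinoma", while the plain query is rewritten to "cancer" before matching.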