I have an index with Name field . I want to use soundex analyzer and synonym analyzer on that field.
I want to achieve both in a single index .Is it even possible ? Please help me experts out there
{
"settings": {
"index": {
"number_of_shards": "1",
"provided_name": "phonetic_sample",
"creation_date": "1603097131476",
"analysis": {
"filter": {
"my_soundex": {
"replace": "false",
"type": "phonetic",
"encoder": "soundex"
}
},
"analyzer": {
"my_analyzer": {
"filter": [
"lowercase",
"my_soundex"
],
"tokenizer": "standard"
}
}
}
I query for Catherine and match Catherine,Katherine and Kathryn
{
"settings": {
"index": {
"number_of_shards": "1",
"provided_name": "phonetic_synonym",
"creation_date": "1603121439096",
"analysis": {
"filter": {
"synonym": {
"format": "wordnet",
"type": "synonym",
"synonyms": [
"s(100000001,1,'Bill',v,1,0).",
"s(100000001,2,'William',v,1,0).",
"s(100000001,3,'Wilhelm',v,1,0)."
]
}
},
"analyzer": {
"synonym": {
"filter": [
"synonym"
],
"tokenizer": "whitespace"
}
}
}
I query for Bill and match Bill, William and Wilhelm
You can use multi-field with multiple analyzers. You can declare sub-fields for the
name
field, each with a different analyzer.
Below is the modified index mapping.
Index Mapping:
{
"settings": {
"index": {
"analysis": {
"filter": {
"my_soundex": {
"type": "phonetic",
"encoder": "metaphone",
"replace": false
},
"synonym": {
"format": "wordnet",
"type": "synonym",
"synonyms": [
"s(100000001,1,'Bill',v,1,0).",
"s(100000001,2,'William',v,1,0).",
"s(100000001,3,'Wilhelm',v,1,0)."
]
}
},
"analyzer": {
"synonym": {
"filter": [
"synonym"
],
"tokenizer": "whitespace"
},
"my_analyzer": {
"filter": [
"lowercase",
"my_soundex"
],
"tokenizer": "standard"
}
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analzyer": "synonym",
"search_analyzer": "synonym",
"fields": {
"content": {
"type": "text",
"analyzer": "my_analyzer",
"search_analyzer": "my_analyzer"
}
}
}
}
}
}
Then you can refer to name
and name.content
in your queries. Your search query will be like this:
{
"query": {
"multi_match": {
"query": "Bill",
"fields": [
"name",
"name.content"
],
"type": "most_fields"
}
}
}