I'm using Elasticsearch to build a small search app and am trying to figure out how to build an autocomplete feature with multi-word (phrase) suggestions. I have it working... sort of...
I get mostly single-word suggestions, but as soon as I hit the space bar the suggestions disappear.
For example, typing "fast" works fine, but typing "fast " (with a trailing space) stops the suggestions from appearing.
I'm using Edge N-Grams and match_phrase_prefix, and have followed the examples here and here to build it out. I query the _all field with match_phrase_prefix and used include_in_all: false to exclude every field except title and content. I'm starting to think it's just because I'm testing on a small data set and there simply aren't enough tokenized terms to produce multi-word suggestions. Please take a look at the relevant code below and tell me where I'm going wrong, if anywhere.
"analysis": {
    "filter": {
        "autocomplete_filter": {
            "type": "edge_ngram",
            "min_gram": "1",
            "max_gram": "20",
            "token_chars": [
                "letter",
                "digit"
            ]
        }
    },
    "analyzer": {
        "autocomplete": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": [
                "lowercase",
                "asciifolding",
                "autocomplete_filter"
            ]
        },
        "whitespace_analyzer": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": [
                "lowercase",
                "asciifolding"
            ]
        }
    }
}
Try the keyword tokenizer:
"autocomplete": {
    "type": "custom",
    "filter": [
        "lowercase",
        "asciifolding",
        "autocomplete_filter"
    ],
    "tokenizer": "keyword"
}
For reference, see "elasticsearch mapping tokenizer keyword to avoid splitting tokens and enable use of wildcard".
The keyword tokenizer keeps the whole input as a single token, whereas the default standard analyzer splits it on spaces.
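To see why the tokenizer matters here, the following is a rough Python simulation of the two analysis chains. It is not Elasticsearch itself, only a sketch under simplifying assumptions: the function names `edge_ngrams` and `analyze` and the sample title "Fast Cars" are made up for illustration, the asciifolding step is left out, and the whitespace tokenizer is approximated with `str.split()`.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of a token, like an edge n-gram filter."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text, tokenizer):
    """Lowercase the text, tokenize it, then expand each token into edge n-grams."""
    text = text.lower()
    # "whitespace" splits on spaces; "keyword" keeps the whole input as one token.
    tokens = text.split() if tokenizer == "whitespace" else [text]
    grams = []
    for token in tokens:
        grams.extend(edge_ngrams(token))
    return grams


title = "Fast Cars"  # hypothetical document title
whitespace_grams = analyze(title, "whitespace")
keyword_grams = analyze(title, "keyword")

# With whitespace tokenization no indexed term ever contains a space.
print("fast c" in whitespace_grams)  # False
# With the keyword tokenizer the prefixes run across the space.
print("fast c" in keyword_grams)     # True
```

With the whitespace tokenizer the indexed terms are prefixes of "fast" and of "cars" separately, so nothing in the index can match input that contains a space. With the keyword tokenizer the prefixes are taken from the whole title, up to max_gram characters.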
You can check which tokens your analyzer produces with the _analyze API, for example: curl 'localhost:9200/test/_analyze?pretty=1&analyzer=my_edge_ngram_analyzer' -d 'FC Schalke 04'
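As far as I know, newer Elasticsearch versions expect the analyzer name and the text in a JSON request body rather than in the query string. The sketch below only builds such a request and does not send it. The helper name `build_analyze_request` is made up for illustration, the host and index are copied from the curl example above, and the analyzer name "autocomplete" is assumed to be the one defined in your index settings.

```python
import json


def build_analyze_request(host, index, analyzer, text):
    """Build the URL and JSON body for a body-style _analyze request."""
    url = f"{host}/{index}/_analyze"
    body = json.dumps({"analyzer": analyzer, "text": text})
    return url, body


# Host, index and analyzer name are assumptions; adjust them to your cluster.
url, body = build_analyze_request(
    "http://localhost:9200", "test", "autocomplete", "FC Schalke 04"
)
print(url)
print(body)
```

The printed body can then be sent with curl or any HTTP client, using a Content-Type: application/json header.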
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html