I'm trying this on a local Elasticsearch 1.7.5 installation:
http://localhost:9200/_analyze?filter=shingle&tokenizer=keyword&text=alkis stack
I see this:
{
  "tokens": [
    {
      "token": "alkis stack",
      "start_offset": 0,
      "end_offset": 11,
      "type": "word",
      "position": 1
    }
  ]
}
but I expected to see something like this:
{
  "tokens": [
    {
      "token": "alkis stack",
      "start_offset": 0,
      "end_offset": 11,
      "type": "word",
      "position": 1
    },
    {
      "token": "stack alkis",
      "start_offset": 0,
      "end_offset": 11,
      "type": "word",
      "position": 1
    }
  ]
}
Am I missing something?
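For reference, the shingle filter only combines *adjacent* tokens from the tokenizer's output, in their original order. A rough Python sketch of that behavior (this is an illustration, not Elasticsearch code) shows why the keyword tokenizer, which emits the whole input as one token, leaves the filter nothing to combine, and why a reversed permutation like "stack alkis" would not be produced either way:

```python
def shingles(tokens, min_size=2, max_size=4, output_unigrams=True, sep=" "):
    """Sketch of a shingle filter: join runs of ADJACENT tokens, in order."""
    out = list(tokens) if output_unigrams else []
    for size in range(min_size, max_size + 1):
        for i in range(len(tokens) - size + 1):
            out.append(sep.join(tokens[i:i + size]))
    return out

# keyword tokenizer: the whole input is ONE token, so nothing can be combined
print(shingles(["alkis stack"]))     # ['alkis stack']

# standard tokenizer: two tokens, so the in-order bigram is added;
# the reversed "stack alkis" is never generated
print(shingles(["alkis", "stack"]))  # ['alkis', 'stack', 'alkis stack']
```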
Update
{
  "number_of_shards": 2,
  "number_of_replicas": 0,
  "analysis": {
    "char_filter": {
      "map_special_chars": {
        "type": "mapping",
        "mappings": [
          "- => \\u0020",
          ". => \\u0020",
          "? => \\u0020",
          ", => \\u0020",
          "` => \\u0020",
          "' => \\u0020",
          "\" => \\u0020"
        ]
      }
    },
    "filter": {
      "permutate_fullname": {
        "type": "shingle",
        "max_shingle_size": 4,
        "min_shingle_size": 2,
        "output_unigrams": true,
        "token_separator": " ",
        "filler_token": "_"
      }
    },
    "analyzer": {
      "fullname_analyzer_search": {
        "char_filter": [
          "map_special_chars"
        ],
        "filter": [
          "asciifolding",
          "lowercase",
          "trim"
        ],
        "type": "custom",
        "tokenizer": "keyword"
      },
      "fullname_analyzer_index": {
        "char_filter": [
          "map_special_chars"
        ],
        "filter": [
          "asciifolding",
          "lowercase",
          "trim",
          "permutate_fullname"
        ],
        "type": "custom",
        "tokenizer": "keyword"
      }
    }
  }
}
And I'm trying to test it like this:
http://localhost:9200/INDEX_NAME/_analyze?analyzer=fullname_analyzer_index&text=alkis stack
Index the first name and the last name in two separate fields in ES, just as you have them in the DB. The text received as a query can be analyzed (`match` does this, for example; so does `query_string`). And there are ways to search both fields at the same time, requiring all the terms in the search string. I think you are over-complicating the use case by putting the full name in a single field and creating name permutations at indexing time.
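As a sketch of that two-field approach, a `multi_match` query of type `cross_fields` with `operator: and` requires every term to match in at least one of the fields. The field names `first_name` and `last_name` are assumptions here, not part of the mapping above:

```python
# Hypothetical request body for searching both name fields with all terms.
# With cross_fields + "and", "alkis stack" matches regardless of which field
# holds which word, so no permutations need to be indexed.
query_body = {
    "query": {
        "multi_match": {
            "query": "alkis stack",
            "type": "cross_fields",
            "fields": ["first_name", "last_name"],
            "operator": "and",
        }
    }
}

print(query_body["query"]["multi_match"]["type"])  # cross_fields
```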