I use Kibana to execute query elastic (Query string query).
When i search a word include escapable characters (reserved characters like: '\', '+', '-', '&&', '||', '!', '(', ')', '{', '}', '[', ']', '^', '"', '~', '*', '?', ':', '/'). It will get expected result. My example use: '!'
But when i search single reserved character. I got nothing. Or:
How can i search with single reserved character?
!
won't be stripped away during the ingestion phase.In the first screenshot you've correctly tried running _analyze
. Let's use it to our advantage.
See, when you don't specify any analyzer, ES will default to the standard
analyzer which is, by definition, constrained by the standard
tokenizer which'll strip away any special chars (except the apostrophe '
and some other chars).
Running
GET dev_application/_analyze?filter_path=tokens.token
{
"tokenizer": "standard",
"text": "Se, det ble grønt ! a"
}
thus yields:
["Se", "det", "ble", "grønt", "a"]
This means you'll need to use some other tokenizer which'll preserve these chars instead. There are a few built-in ones available, the simplest of which would be the whitespace
tokenizer.
Running
GET _analyze?filter_path=tokens.token
{
"tokenizer": "whitespace",
"text": "Se, det ble grønt ! a"
}
retains the !
:
["Se,", "det", "ble", "grønt", "!", "a"]
So,
DELETE dev_application
(I chose the multi-field approach which'll preserve the original, standard analyzer and only apply the whitespace
tokenizer on the name.splitByWhitespace
subfield.)
PUT dev_application
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"splitByWhitespaceAnalyzer": {
"tokenizer": "whitespace"
}
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"fields": {
"splitByWhitespace": {
"type": "text",
"analyzer": "splitByWhitespaceAnalyzer"
}
}
}
}
}
}
POST dev_application/_doc
{
"name": "Se, det ble grønt ! a"
}
GET dev_application/_search
{
"query": {
"query_string": {
"default_field": "name.splitByWhitespace",
"query": "*\\!*",
"default_operator": "AND"
}
}
}
Do note that if you leave the default_field
out, it won't work because of the standard
analyzer.
Indeed, you could reverse this approach, apply whitespace
by default, and create a multi-field mapping for the "original" indexing strategy (-> the only config being "type": "text"
).
Shameless plug: I wrote a book on Elasticsearch and you may find it useful!