I'm using Elasticsearch (2.4) with the official Python client to perform simple queries. My code:

    from elasticsearch import Elasticsearch

    es_client = Elasticsearch("localhost:9200")
    index = "indexName"
    doc_type = "docType"

    def search(query, search_size):
        body = {
            "fields": ["title"],
            "size": search_size,
            "query": {
                "query_string": {
                    "fields": ["file.content"],
                    "query": query
                }
            }
        }
        response = es_client.search(index=index, doc_type=doc_type, body=body)
        return response["hits"]["hits"]

    search("python", 10)  # Works fine.

The problem arises when my query contains unbalanced parentheses or brackets. For example, with search("python {programming", 10), ES throws:
elasticsearch.exceptions.RequestError: TransportError(400, u'search_phase_execution_exception', u'Failed to parse query [python {programming}]')
Is that the expected behavior of ES? Doesn't it use a tokenizer to remove all those characters?
Note: This happens to me using Java too.
I was reading the documentation, and the query_string query is stricter about its input. The following are reserved characters: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
So, as jhilden said, I would have to escape them or use simple_query_string instead.
Docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
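
Both options can be sketched roughly as follows. This is only an illustration under a few assumptions: the helper names escape_query_string and build_simple_body are made up for this example, and the field names are the ones from the question. Each reserved character is escaped with a backslash, except < and >, which the docs say cannot be escaped at all and so are simply dropped here.

```python
import re

# Single characters that query_string treats as operators. "&" and "|" are
# escaped individually, which also neutralizes "&&" and "||".
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def escape_query_string(text):
    """Hypothetical helper: make user input safe for a query_string query."""
    # "<" and ">" cannot be escaped, so remove them entirely.
    text = text.replace("<", "").replace(">", "")
    # Prefix every remaining reserved character with a backslash.
    return _RESERVED.sub(r"\\\1", text)


def build_simple_body(query, search_size):
    """Hypothetical helper: same request body as in the question, but using
    simple_query_string, which ignores invalid syntax instead of failing."""
    return {
        "fields": ["title"],
        "size": search_size,
        "query": {
            "simple_query_string": {
                "fields": ["file.content"],
                "query": query,
            }
        },
    }


# Option 1: escape the input, then pass it to the original search() function.
escaped = escape_query_string("python {programming")

# Option 2: send the raw input in a simple_query_string body instead.
body = build_simple_body("python {programming", 10)
```

With option 1 the escaped string would go to search() unchanged; with option 2 the body would be passed to es_client.search(index=index, doc_type=doc_type, body=body). Keep in mind that escaping also disables operators a user may have typed on purpose, such as quotes for phrases or * for wildcards.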