elasticsearch lucene pyelasticsearch

ElasticSearch query_string fails to parse query with some characters


I'm using Elasticsearch 2.4 and the official Python client to perform simple queries. My code:

from elasticsearch import Elasticsearch

es_client = Elasticsearch("localhost:9200")
index = "indexName"
doc_type = "docType"

def search(query, search_size):
    body = {
        "fields": ["title"],
        "size": search_size,
        "query": {
            "query_string": {
                "fields": ["file.content"],
                "query": query
            }
        }
    }
    response = es_client.search(index=index, doc_type=doc_type, body=body)
    return response["hits"]["hits"]

search("python", 10) # Works fine.

The problem arises when my query contains unbalanced parentheses, braces, or brackets. For example, search("python {programming", 10) makes ES throw:

elasticsearch.exceptions.RequestError: TransportError(400, u'search_phase_execution_exception', u'Failed to parse query [python {programming}]')

Is that the expected behavior of ES? Doesn't it use a tokenizer to remove all those characters?

Note: I get the same error when querying from Java.


Solution

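  • I read in the documentation that query_string parses its input strictly and rejects the whole query on a syntax error. The query syntax is parsed before the analyzer sees any terms, so the tokenizer never gets a chance to strip these characters. The following characters are reserved: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /

    So, as jhilden said, I have to either escape them with a backslash or use simple_query_string instead, which discards the invalid parts of a query rather than throwing. Note that, according to the docs, < and > cannot be escaped at all and have to be removed from the query.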

    Docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
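
For anyone who wants to see both options side by side, here is a minimal sketch of what I mean. The helper names (escape_query_string, build_body) and the lenient_syntax flag are my own and not part of the Elasticsearch client; the body layout is copied from the question. It only builds the request body, so it can be checked without a running cluster.

```python
import re

# Single characters that make up the reserved list in the query_string docs.
# "&&" and "||" are covered by escaping each "&" and "|" individually.
# "<" and ">" are handled separately because they cannot be escaped.
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def escape_query_string(query):
    """Drop < and >, then backslash-escape every other reserved character."""
    query = query.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", query)


def build_body(query, search_size, lenient_syntax=False):
    """Build the search body from the question.

    lenient_syntax=True  -> simple_query_string with the raw user input
    lenient_syntax=False -> query_string with the input escaped first
    """
    if lenient_syntax:
        query_type, text = "simple_query_string", query
    else:
        query_type, text = "query_string", escape_query_string(query)
    return {
        "fields": ["title"],
        "size": search_size,
        "query": {
            query_type: {
                "fields": ["file.content"],
                "query": text,
            }
        },
    }
```

The result of build_body("python {programming", 10) can then be passed as body= to es_client.search exactly as in the question. Keep in mind that escaping everything also disables operators such as AND, quotes, and wildcards, so this approach only fits input that should be treated as plain text.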