Building queries dynamically with Python elasticsearch-dsl


I'm new to Elasticsearch and am using it from Python through the elasticsearch-dsl library.

I'm stuck on building queries dynamically.

For example, I can write a query like this directly:

q = (Q('match', age=21) | Q('match', gender='male')) & (~Q('match', name='Stevens'))

But how can I build such a query dynamically?

I tried something like this:

# Simple JSON input that determines which clauses the query should contain
json_input = {'fields': ['age', 'gender'], 'values': {'age': 21, 'gender': 'male'}}

age_query = None
gender_query = None
name_query = None

if 'age' in json_input['fields']:
    age_query = Q('match', age=json_input['values']['age'])

if 'gender' in json_input['fields']:
    gender_query = Q('match', age=json_input['values']['gender'])

if 'name' in json_input['fields']:
    name_query = ~Q('match', age=json_input['values']['name'])

q = Q()
if gender_query or age_query:
    q = (age_query | gender_query)
if name_query:
    q &= name_query

But when I execute the search like this:

s = Search(using=client, index=index)
s = s.query(q)
response = s.execute()

I get this error while executing the search:

RequestError: RequestError(400, u'search_phase_execution_exception', u'failed to create query: {\n  "bool" : {\n    "should" : [\n      {\n        "match" : {\n          "age" : {\n            "query" : 21,\n            "operator" : "OR",\n            "prefix_length" : 0,\n            "max_expansions" : 50,\n            "fuzzy_transpositions" : true,\n            "lenient" : false,\n            "zero_terms_query" : "NONE",\n            "auto_generate_synonyms_phrase_query" : true,\n            "boost" : 1.0\n          }\n        }\n      },\n      {\n        "match" : {\n          "age" : {\n            "query" : "male",\n            "operator" : "OR",\n            "prefix_length" : 0,\n            "max_expansions" : 50,\n            "fuzzy_transpositions" : true,\n            "lenient" : false,\n            "zero_terms_query" : "NONE",\n            "auto_generate_synonyms_phrase_query" : true,\n            "boost" : 1.0\n          }\n        }\n      }\n    ],\n    "adjust_pure_negative" : true,\n    "boost" : 1.0\n  }\n}')

What is wrong with this query generation, and is there a more appropriate way to do it?

P.S. s.to_dict() generates what looks like correct JSON, as shown below:

{'query': {'bool': {'should': [{'match': {'age': 21}},
                           {'match': {'age': 'male'}}]}}}

Solution

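  • The second and third if blocks use the wrong field name: both pass age= where they should pass gender= and name=. The error shows this, since the second match clause queries "age" with the value "male". If age is mapped as a numeric field, as the value 21 suggests, Elasticsearch cannot parse "male" as a number and fails to create the query. The corrected blocks: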

    if 'gender' in json_input['fields']:
        gender_query = Q('match', gender=json_input['values']['gender'])
                                    ^
                                    |
                          change this field name
    
    
    if 'name' in json_input['fields']:
        name_query = ~Q('match', name=json_input['values']['name'])
                                   ^
                                   |
                         change this field name
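
As for a more appropriate way to build the query dynamically, here is a minimal sketch rather than a definitive implementation. It assembles the bool query body as a plain dict and only includes clauses for fields that are actually requested, so a missing clause (None) is never combined with a real one. The helper name build_query and the parameters or_fields and not_fields are hypothetical, chosen for illustration; the json_input shape is the one from the question.

```python
def build_query(json_input, or_fields=("age", "gender"), not_fields=("name",)):
    """Return a query body as a plain dict (hypothetical helper).

    Fields in or_fields are OR-ed together; fields in not_fields are negated.
    Only fields listed in json_input['fields'] produce a clause.
    """
    requested = json_input.get("fields", [])
    values = json_input.get("values", {})

    def match(field):
        # The field name is used as the dict key, so it cannot drift
        # out of sync with the value the way a hand-typed keyword can.
        return {"match": {field: values[field]}}

    should = [match(f) for f in or_fields if f in requested]
    must_not = [match(f) for f in not_fields if f in requested]

    # Nothing requested: fall back to matching every document.
    if not should and not must_not:
        return {"match_all": {}}

    bool_body = {}
    if should:
        bool_body["should"] = should
        # Require at least one of the OR-ed clauses to match.
        bool_body["minimum_should_match"] = 1
    if must_not:
        bool_body["must_not"] = must_not
    return {"bool": bool_body}


json_input = {"fields": ["age", "gender"], "values": {"age": 21, "gender": "male"}}
query = build_query(json_input)
```

With elasticsearch-dsl, the resulting dict can be wrapped as Q(query) and passed to s.query(...) in place of the hand-built q. The same idea also works with Q objects directly: Q('match', **{field: value}) builds a clause from a field name held in a variable.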