
ElasticSearch with filter via elasticutils


I'm currently trying to use a filter in an existing ElasticSearch instance via the library elasticutils. I'm getting nowhere, unfortunately. I'm not sure if the problem is because I did something basic wrong or if there's a problem in the library (could well be, AFAICT).

I've got an index with a specific mapping, containing a field (say "A") of type string (no explicit analyzer given). That field always contains a list of strings.

I'd like to filter my documents down to those whose field A contains a given string, so I tried:

import elasticutils as eu

# URL, INDEX and DOCTYPE stand for my cluster URL, index name and document type
es = eu.S().es(urls=[URL]).indexes(INDEX).doctypes(DOCTYPE)
f = eu.F(A="text")  # filter on field A
result = es.filter(f)

But that returns an empty result set. I also tried it using f = eu.F(A__in="text") but that resulted in a large error message, the most intriguing part of it being [terms] filter does not support [A].

I'm wondering if I have to configure my index differently, maybe I have to create a facet to be able to use filter? But I didn't find any hint on this in the documentation I read.

My reason for wanting to use filters is that they can be combined freely using and, or, and not. I also found some specs describing that a query can be boolean as well, but they typically refer to must, should, and must_not, which I don't think are flexible enough for me. I also found some specs mentioning an operator flag for queries, which can be set to and or or. Any info on that is welcome.

So, my questions now are:


Solution

  • airza hit the nail on the head with his answer in terms of the filter you're looking for, in curl format. I suspect the issues you're seeing are largely due to using an abstraction module like elasticutils. It would be good to get familiar with the underlying ES query protocol first, which will make understanding elasticutils easier. As in my comment above, I recommend installing 'Sense', a plugin for Google Chrome that lets you easily query your ES cluster: https://chrome.google.com/webstore/detail/sense/doinijnbnggojdlcjifpdckfokbbfpbo?hl=en.

    Elasticsearch query filters are extremely flexible - and 'nestable'. You can quite easily nest an or filter inside of a bool must filter. Example:

    {
        "query": {
            "filtered": {
               "query": {
                   "match_all": {}
               },
               "filter": {
                   "bool": {
                       "must": [
                           {
                               "or": [
                                     {"exists": {"field": "sessions"}},
                                     {"range": {"id": {"gte": 56000}}}
                               ]
                           },
                           {
                               "term": {"age_min": "13"}
                           }
                       ],
                       "should": [
                          {
                              "term": {"area": "1"}
                          }
                       ]
                   }
               }
            }
        }
    }
    

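    In this example, results must match at least one of the two filters inside the or clause, and must also match the age_min term filter. Note that filters do not contribute to the relevance score, so a should clause ranks matching items higher only when it appears in a bool query rather than in a bool filter like this one.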
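If you would rather assemble such a body from Python than write it by hand, the nesting can be sketched with a few small helper functions that return plain dicts. This is only a minimal sketch using the standard library: the helper names (term, exists, range_gte, or_, bool_filter, filtered_query, build_search_request), the URL http://localhost:9200 and the index name my_index are all made up for illustration and are not part of elasticutils or any other client. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def term(field, value):
    """Exact-match filter on one field."""
    return {"term": {field: value}}


def exists(field):
    """Matches documents that have a value for the field."""
    return {"exists": {"field": field}}


def range_gte(field, lower):
    """Matches documents whose field is >= lower."""
    return {"range": {field: {"gte": lower}}}


def or_(*filters):
    """Matches documents that pass at least one of the given filters."""
    return {"or": list(filters)}


def bool_filter(must=(), should=(), must_not=()):
    """Combine filters into a bool filter, leaving out empty clauses."""
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if should:
        clauses["should"] = list(should)
    if must_not:
        clauses["must_not"] = list(must_not)
    return {"bool": clauses}


def filtered_query(filter_, query=None):
    """Wrap a filter in a filtered query; defaults to match_all."""
    return {
        "query": {
            "filtered": {
                "query": query or {"match_all": {}},
                "filter": filter_,
            }
        }
    }


def build_search_request(base_url, index, body):
    """Build (but do not send) a POST request to the index's _search endpoint."""
    url = "{}/{}/_search".format(base_url.rstrip("/"), index)
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same structure as the JSON example above.
body = filtered_query(
    bool_filter(
        must=[
            or_(exists("sessions"), range_gte("id", 56000)),
            term("age_min", "13"),
        ],
        should=[term("area", "1")],
    )
)

# Hypothetical host and index name; replace with your own.
request = build_search_request("http://localhost:9200", "my_index", body)
print(json.dumps(body, indent=2))
```

Passing the request to urllib.request.urlopen would send it to the cluster. Printing the body first and pasting it into Sense or curl is a quick way to check that the nesting is what you intended before involving a client library.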