Tags: python, elasticsearch, elasticsearch-dsl, elasticsearch-dsl-py

Elasticsearch DSL: filter a list of objects by a list of values


My data looks like this:

[
    {
        "id": "00f0bbe514dcaf262c8a",
        "status": "CL",
        "type": "opportunity",
        "locations": [
            {
                "name": "New York, USA",
                "lat": 99.0853,
                "lng": 99.7818,
                "id": "456",
                "type": "CI"
            },
            {
                "name": "Boston, USA",
                "lat": 80.0853,
                "lng": 80.7818,
                "id": "555",
                "type": "CI"
            },
            {
                "name": "London, UK",
                "lat": 10.0853,
                "lng": 10.7818,
                "id": "999",
                "type": "CI"
            }
        ]
    },
    {
        "id": "sadl9asod01",
        "status": "CL",
        "type": "opportunity",
        "locations": [
            {
                "name": "Boston, USA",
                "lat": 80.0853,
                "lng": 80.7818,
                "id": "555",
                "type": "CI"
            },
        ]
    },
    {
        "id": "13094ulk",
        "status": "CL",
        "type": "project",  # has right location but not type
        "locations": [
            {
                "name": "Boston, USA",
                "lat": 80.0853,
                "lng": 80.7818,
                "id": "555",
                "type": "CI"
            },
        ]
    }

]

I want to build a query that requires the type to be opportunity:

type_q = ElasticQ('bool', must=[ElasticQ('match', type='opportunity')])
query = self.index.search().query(type_q)

I know how to build an "in" query with the DSL, for example:

excluded_ids = self._excluded_jobs() # list
query = query.exclude('terms', id=excluded_ids)

But how can I add to the query what I would write in SQL like this:

WHERE type='opportunity' 
AND 
location.id in (1, 2, 3)

Solution

  • Something like:

    type_q = ElasticQ('bool', must=[
      ElasticQ('match', type='opportunity'),
      ElasticQ('terms', id=excluded_ids),
    ])
    

    Or, if you actually wanted to exclude those IDs:

    type_q = ElasticQ('bool',
                      must=[ElasticQ('match', type='opportunity')],
                      must_not=[ElasticQ('terms', id=excluded_ids)],
    )
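
    The SQL in the question filters on the id of the objects inside locations, whereas the snippets above filter on the top-level id. Here is a minimal sketch of that condition as a raw query body, built with plain dicts so it runs without a cluster. The helper name build_opportunity_query and its nested flag are made up for illustration. Whether the nested wrapper is needed depends on how locations is mapped, which the question does not say.

```python
import json


def build_opportunity_query(location_ids, nested=False):
    """Build a query body for: type = 'opportunity' AND locations.id IN location_ids."""
    # `terms` matches when the field holds any one of the listed values (like SQL IN)
    location_clause = {"terms": {"locations.id": list(location_ids)}}

    if nested:
        # Assumption: only needed when `locations` is mapped with the `nested` type
        location_clause = {
            "nested": {"path": "locations", "query": location_clause}
        }

    return {
        "query": {
            "bool": {
                # Same `match` clause the question already uses
                "must": [{"match": {"type": "opportunity"}}],
                # Filter context: no scoring needed for a yes/no membership test
                "filter": [location_clause],
            }
        }
    }


# The ids in the sample data are strings, so strings are used here too
body = build_opportunity_query(["555", "999"])
print(json.dumps(body, indent=2))
```

    With elasticsearch-dsl, a body like this can be loaded with Search.from_dict(body). The location clause can also be written directly as ElasticQ('terms', locations__id=["555", "999"]), since the library turns a double underscore in a keyword argument into a dot.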