Tags: elasticsearch, elasticsearch-2.0

Elasticsearch: get only document IDs (the _id field) from a search query on an index


For a given query I want to get only the list of _id values without getting any other information (without _source, _index, _type, ...).

I noticed that using _source and requesting non-existent fields returns only minimal data, but can I get even less data in the response? Some answers suggest using the hits part of the response, but I do not want the other information.
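
To make the "minimal data" approach concrete, here is a small sketch of what it might look like. The helper names `build_id_only_body` and `extract_ids` are hypothetical, and `sample_response` is a hand-written dictionary shaped like a standard search response, not output from a real cluster. It assumes `"_source": False` in the request body is used to suppress the document source.

```python
def build_id_only_body(query):
    """Build a search body that asks Elasticsearch to omit _source from each hit."""
    return {"query": query, "_source": False}


def extract_ids(response):
    """Collect the _id of every hit in a search response dictionary."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


# Hand-written sample, shaped like a search response with _source disabled.
# Note that each hit still carries _index, _type and _score.
sample_response = {
    "took": 2,
    "timed_out": False,
    "hits": {
        "total": 2,
        "max_score": 1.0,
        "hits": [
            {"_index": "my_index", "_type": "my_type", "_id": "1", "_score": 1.0},
            {"_index": "my_index", "_type": "my_type", "_id": "2", "_score": 1.0},
        ],
    },
}

body = build_id_only_body({"match_all": {}})
ids = extract_ids(sample_response)
print(ids)  # ['1', '2']
```

The sample shows the remaining overhead: even without _source, every hit still includes _index, _type and _score, so the ids have to be picked out on the client side.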


Solution

  • It is better to use scroll and scan to fetch the result list, so that Elasticsearch doesn't have to rank and sort the results.

    With the elasticsearch-dsl Python library, this can be done as follows:

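    from elasticsearch import Elasticsearch
    from elasticsearch_dsl import Search

    # Placeholders: replace with your own index name and document type.
    ES_INDEX = "my_index"
    DOC_TYPE = "my_type"

    es = Elasticsearch()  # connects to localhost:9200 by default
    s = Search(using=es, index=ES_INDEX, doc_type=DOC_TYPE)

    # An empty list requests ids only; otherwise `fields` takes a list of field names.
    s = s.fields([])
    # scan() iterates over every match through the scroll API.
    ids = [h.meta.id for h in s.scan()]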
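
For readers not using elasticsearch-dsl, the same idea can be sketched as a manual scroll loop. This is only an illustration under stated assumptions: `scan_ids` and `FakeClient` are hypothetical names, the client is assumed to expose `search(index=..., body=..., scroll=..., size=...)` and `scroll(scroll_id=..., scroll=...)` as the official low-level Python client does, and sorting by `_doc` is assumed to be the way to skip ranking on your Elasticsearch version. `FakeClient` exists only so the sketch runs without a cluster.

```python
def scan_ids(client, index, query, page_size=1000, keep_alive="2m"):
    """Collect the _id of every matching document, page by page, via scrolling."""
    # Assumption: disabling _source and sorting by _doc keeps each page cheap.
    body = {"query": query, "_source": False, "sort": ["_doc"]}
    page = client.search(index=index, body=body, scroll=keep_alive, size=page_size)
    ids = []
    while page["hits"]["hits"]:  # an empty page means the scroll is exhausted
        ids.extend(hit["_id"] for hit in page["hits"]["hits"])
        page = client.scroll(scroll_id=page["_scroll_id"], scroll=keep_alive)
    return ids


class FakeClient:
    """Hypothetical stand-in for a real client; serves pre-canned pages of ids."""

    def __init__(self, pages):
        self._pages = list(pages)

    def _next_page(self):
        page_ids = self._pages.pop(0) if self._pages else []
        return {"_scroll_id": "dummy", "hits": {"hits": [{"_id": i} for i in page_ids]}}

    def search(self, **kwargs):
        return self._next_page()

    def scroll(self, **kwargs):
        return self._next_page()


result = scan_ids(FakeClient([["a", "b"], ["c"]]), "my_index", {"match_all": {}})
print(result)  # ['a', 'b', 'c']
```

With a real cluster, an `Elasticsearch()` instance would be passed in place of `FakeClient`. The loop stops on the first empty page, which is how the scroll API signals that there are no more results.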