I am brand new to using Elasticsearch and I'm having an issue getting all results back when I run an Elasticsearch query through my Python script. My goal is to query an index ("my_index" below), take those results, and put them into a pandas DataFrame which goes through a Django app and eventually ends up in a Word document.
My code is:
es = Elasticsearch()
logs_index = "my_index"
logs = es.search(index=logs_index,body=my_query)
and it tells me I have 72 hits, but then when I do:
df = logs['hits']['hits']
len(df)
It says the length is only 10. I saw someone had a similar issue on this question but their solution did not work for me.
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search
es = Elasticsearch()
logs_index = "my_index"
search = Search(using=es)
total = search.count()
search = search[0:total]
logs = es.search(index=logs_index,body=my_query)
len(logs['hits']['hits'])
The len function still says I only have 10 results. What am I doing wrong, or what else can I do to get all 72 results back?
ETA: I am aware that I can just add "size": 10000 to my query to stop it from truncating to just 10, but since the user will be entering their search query I need to find another way that isn't just in the search query.
You need to pass a size parameter to your es.search() call.
Please read the API Docs
size – Number of hits to return (default: 10)
An example:
es.search(index=logs_index, body=my_query, size=1000)
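If you would rather not hard-code the number, one option is to ask Elasticsearch for the total first and then request exactly that many hits. The following is a minimal sketch, not the only way to do it: `fetch_all_hits` and `max_size` are hypothetical names, it assumes a client version that still accepts `body` as in the question, and it assumes the default `index.max_result_window` of 10000.

```python
def fetch_all_hits(es, index, query, max_size=10000):
    """Return every hit for `query`, up to `max_size` (hypothetical helper).

    `es` is an Elasticsearch client such as the one created in the question.
    """
    # size=0 returns no documents, only metadata such as the total hit count.
    first = es.search(index=index, body=query, size=0)
    total = first["hits"]["total"]
    # Elasticsearch 7+ reports the total as {"value": N, "relation": ...};
    # older versions report a plain integer.
    if isinstance(total, dict):
        total = total["value"]

    # Assumed cap: from + size may not exceed index.max_result_window.
    size = min(total, max_size)
    if size == 0:
        return []

    result = es.search(index=index, body=query, size=size)
    return result["hits"]["hits"]
```

With the names from the question this would be called as `fetch_all_hits(es, logs_index, my_query)`, and the rows for the DataFrame could then be built with `pd.DataFrame([hit["_source"] for hit in hits])`.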
Please note that this is not an efficient way to retrieve every document in an index, or the results of a query that matches a large number of documents. For that you should use a scroll operation, which is also documented in the API Docs under the scan() helper, an abstraction over the Elasticsearch scroll operation.
You can also read about it in the Elasticsearch documentation.
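To illustrate what a scroll looks like, here is a rough sketch of the loop written by hand with the client's `search`, `scroll` and `clear_scroll` calls. The function name `scroll_all_hits` and the `page_size` and `keep_alive` defaults are assumptions made for this example. In practice `elasticsearch.helpers.scan(es, query=my_query, index=logs_index)` wraps a similar loop and is usually the simpler choice.

```python
def scroll_all_hits(es, index, query, page_size=1000, keep_alive="2m"):
    """Yield every hit for `query`, one page at a time (hypothetical helper)."""
    # The first request opens a scroll context and returns the first page.
    page = es.search(index=index, body=query, scroll=keep_alive, size=page_size)
    scroll_id = page["_scroll_id"]
    try:
        # An empty page means the scroll is exhausted.
        while page["hits"]["hits"]:
            for hit in page["hits"]["hits"]:
                yield hit
            # Fetch the next page; the scroll id may change between calls.
            page = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = page["_scroll_id"]
    finally:
        # Release the server-side scroll context when done or on error.
        es.clear_scroll(scroll_id=scroll_id)
```

Because this is a generator, `list(scroll_all_hits(es, logs_index, my_query))` collects all hits regardless of how many there are, without putting a size into the user's query.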