I'm using the elasticsearch-rails gem.
I get the following error when a user clicks a link to a high page number in my pagination:
Elasticsearch::Transport::Transport::Errors::InternalServerError ([500] {"error":{"root_cause":[{"type":"query_phase_execution_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [48700]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"products","node":"fNcaDjwzRRGu2fq0KjTWUQ","reason":{"type":"query_phase_execution_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [48700]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}}]},"status":500}):
Where do I set index.max_result_window, and is this even the proper way to fix the issue?
Do you really need to let users paginate 48,000 entries deep? The problem lies in how Elasticsearch works internally. To return the Nth result, Elasticsearch has to fetch the first N-1 results as well, only to discard them. That is why such queries get more expensive the deeper users paginate.
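To make the numbers in the error concrete, here is a minimal sketch of the arithmetic behind "from + size". The helper names and the page size of 50 are assumptions for illustration (the error only reports the total, 48700); the 10,000 limit is the value quoted in the error message.

```ruby
# Limit quoted in the error message (index.max_result_window).
MAX_RESULT_WINDOW = 10_000

# Hypothetical helper: the "from + size" window that a page-based
# request asks Elasticsearch to collect.
def result_window(page, per_page)
  from = (page - 1) * per_page # results skipped before this page
  from + per_page              # skipped results plus the page itself
end

# Hypothetical helper: true if the request stays inside the limit.
def within_window?(page, per_page, max_window = MAX_RESULT_WINDOW)
  result_window(page, per_page) <= max_window
end

# Assumed page size of 50: page 974 reproduces the 48700 from the error.
puts result_window(974, 50)  # 48700
puts within_window?(974, 50) # false
puts within_window?(200, 50) # true (window is exactly 10000)
```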
The Scroll API might be a good fit if you have only a few users and they all behave (that is, they don't open a zillion scroll contexts at the same time and keep them alive forever), but probably not.
If it is an option, limit your pagination to these 10k results (or even fewer) and encourage your users to be more specific in their queries. If not, be prepared to scale the hardware in your cluster (memory!) and to have long-running queries.
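One way to enforce such a limit is to clamp the requested page before it reaches the search call. This is a rough sketch, not part of elasticsearch-rails: `clamp_page` is a hypothetical helper, and the page size of 50 is again an assumption.

```ruby
# Hypothetical helper: clamp a requested page number so that
# from + size never exceeds the result window.
def clamp_page(requested_page, per_page, max_window: 10_000)
  # Last page whose window still fits; at least 1 even for huge page sizes.
  last_page = [max_window / per_page, 1].max
  # to_i copes with string or nil params; clamp needs Ruby 2.4+.
  requested_page.to_i.clamp(1, last_page)
end

page = clamp_page("974", 50) # request from the error, assumed page size 50
from = (page - 1) * 50

puts page      # 200
puts from + 50 # 10000
```

The clamped value can then be passed to whatever pagination call your app already uses, and the view can hide links beyond the last allowed page.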