python, json, elasticsearch, elasticsearch-py

Sequential Searching Across Multiple Indexes In Elasticsearch


Suppose I have Elasticsearch indexes in the following order:

index-2022-04
index-2022-05
index-2022-06
...

index-2022-04 represents the data stored in the month of April 2022, index-2022-05 represents the data stored in the month of May 2022, and so on. Now let's say in my query payload, I have the following timestamp range:

"range": {
    "timestampRange": {
        "gte": "2022-04-05T01:00:00.708363",
        "lte": "2022-06-06T23:00:00.373772"
    }
}

The above range states that I want to query the data that exists between the 5th of April and the 6th of June. That means I have to query three indexes: index-2022-04, index-2022-05 and index-2022-06. Is there a simple and efficient way of performing this query across those three indexes without having to query each index one by one?

I am using Python to handle the query, and I am aware that I can query across different indexes at the same time (see this SO post). Any tips or pointers would be helpful, thanks.


Solution

  • You simply need to define an alias over your indices and query the alias instead of the individual indexes. ES then figures out which underlying indexes it needs to visit.

    Additionally, for better search performance, you can configure index-time sorting on timestampRange, so that if your alias spans a full year of indexes, ES knows to visit only three of them based on the range constraint in your query (2022-04-05 -> 2022-06-06).
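
    As a rough illustration of the alias approach with elasticsearch-py, here is a minimal sketch. The alias name index-all, the wildcard pattern index-2022-* and the local cluster URL are assumptions made for the example, not something taken from the question. It also assumes a client version that accepts the query keyword on search (7.15 or later); on older clients the same dict would go under body={"query": ...}.

```python
def build_range_query(gte, lte, field="timestampRange"):
    """Build the range clause from the question as a plain dict."""
    return {"range": {field: {"gte": gte, "lte": lte}}}


def search_alias(es, alias, gte, lte):
    """Run the range query against the alias; ES resolves the backing indexes."""
    return es.search(index=alias, query=build_range_query(gte, lte))


def main():
    # Imported here so the helpers above work without the client installed.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Point one alias at every monthly index.
    # "index-all" and the wildcard pattern are hypothetical names.
    es.indices.put_alias(index="index-2022-*", name="index-all")

    resp = search_alias(
        es,
        "index-all",
        "2022-04-05T01:00:00.708363",
        "2022-06-06T23:00:00.373772",
    )
    print(resp["hits"]["total"])
```

    Calling main() against a running cluster creates the alias once and then searches through it. If you would rather not maintain an alias, passing the wildcard pattern itself as the index argument of search should behave similarly, since the search API also accepts index patterns.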