elasticsearch elasticsearch-6

Max value of number_of_routing_shards in Elasticsearch 6.x


What is the max recommended value of number_of_routing_shards for an index?

Can I specify a very high value like 30000? What are the side effects if I do so?


Solution

  • Shards are "slices" of an index that Elasticsearch creates so that indexed data can be distributed flexibly, for example across several data nodes.

    At a low level, shards are independent sets of Lucene segments that work autonomously and can be queried independently. This is what makes high performance possible, because a search operation can be split into independent processes.

    The more shards you have, the more flexibly the storage for a given index can be assigned. This comes with some caveats.

    Distributed searches must wait for each other so that their partial results can be merged into one consistent response. If there are many shards, the query must be sliced into more parts, which adds computing overhead. The query is sent to each shard whose hashes match the current search (not every query necessarily hits every shard), so the busiest (slowest) shard determines the overall performance of your search.

    It's better to keep the number of indexes balanced. Each index has a memory footprint that is stored in the cluster state. The more indexes you have, the bigger the cluster state becomes and the longer it takes to share it among all cluster nodes. The more shards an index has, the more complex it becomes, so serializing it into the cluster state takes more space, which slows things down globally.
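
The two ideas above (hash-based routing to shards, and the slowest shard setting the pace) can be illustrated with a rough sketch. It is not Elasticsearch code. The routing arithmetic is modeled on the formula given in the Elasticsearch `_routing` documentation, `zlib.crc32` is only a stand-in for the Murmur3 hash Elasticsearch uses, and the function names and latency figures are made up for illustration.

```python
import zlib


def shard_for(routing_value: str, num_primary_shards: int, num_routing_shards: int) -> int:
    """Map a routing value to a primary shard number (illustrative only)."""
    if num_routing_shards % num_primary_shards != 0:
        raise ValueError("num_routing_shards must be a multiple of num_primary_shards")
    routing_factor = num_routing_shards // num_primary_shards
    # Stand-in hash: Elasticsearch uses Murmur3; crc32 keeps this sketch stdlib-only.
    h = zlib.crc32(routing_value.encode("utf-8"))
    return (h % num_routing_shards) // routing_factor


def search_latency_ms(per_shard_latency_ms: dict, shards_hit: list) -> float:
    """A distributed search is only done when its slowest shard has answered."""
    return max(per_shard_latency_ms[s] for s in shards_hit)


# Hypothetical per-shard latencies for a 5-shard index; shard 2 is the busy one.
latencies = {0: 12.0, 1: 15.0, 2: 90.0, 3: 11.0, 4: 14.0}

# A search without routing fans out to every shard and waits for the slowest.
print(search_latency_ms(latencies, sorted(latencies)))  # 90.0

# A search with a routing value only touches the shard that value hashes to.
target = shard_for("user-42", num_primary_shards=5, num_routing_shards=30)
print(target, search_latency_ms(latencies, [target]))
```

In this model, doubling the primary shard count while keeping `num_routing_shards` fixed sends every routing value to one of the two children of its old shard. That is the reason the routing shard count has to be a multiple of the shard count you may later want to split to.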