elasticsearch sharding

What are the risks of large shards in Elasticsearch?


At my workplace each of our ES indices is configured to have exactly 5 shards, and we make no use of the Rollover API or ILM. Most of our indices are quite small, but we have one large index where each individual shard is close to 250 GB. There is now discussion of ingesting additional data that would roughly double the size of that index.

I'm trying to pump the brakes on this because, from my understanding of best practices (e.g. those described by Elastic here), shards should ideally be <=50GB. My understanding of the risks involved with letting shards get too big:

Are these accurate? Are there other risks that I should be aware of? I'm also a bit concerned that as the shards grow larger, memory issues could come into play and the entire cluster could become unstable. Is that a well-founded concern?


Solution

  • Those are some of the downsides of oversized shards; in practice you can run into many other issues as well. That is why you should plan your shard count correctly beforehand.
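
To make "plan your shard count beforehand" concrete, here is a minimal back-of-the-envelope sketch using the ~50 GB per-shard guideline and the sizes from the question (5 shards of ~250 GB, with the ingest roughly doubling the index). The 50 GB limit and the helper function are illustrative assumptions, not an official formula:

```python
import math

# Commonly cited upper bound for a single shard (assumption, per the
# guideline referenced in the question; tune for your own cluster).
MAX_SHARD_GB = 50

def recommended_shards(index_size_gb: float, max_shard_gb: float = MAX_SHARD_GB) -> int:
    """Smallest primary-shard count that keeps each shard at or under the limit."""
    return max(1, math.ceil(index_size_gb / max_shard_gb))

current_size_gb = 5 * 250            # 5 shards at ~250 GB each -> ~1250 GB
projected_size_gb = current_size_gb * 2  # the proposed ingest roughly doubles it

print(recommended_shards(current_size_gb))    # 25
print(recommended_shards(projected_size_gb))  # 50
```

In other words, the index described in the question would already want on the order of 25 primary shards to stay under the guideline, and around 50 after the proposed ingest. Since the number of primary shards is fixed at index creation, getting there typically means reindexing into a new index (or using the Split API), which is far cheaper to plan up front than to retrofit.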