elasticsearch oltp

ElasticSearch deleted documents taking more space


We are facing an issue with our ES cluster. Merging is not keeping up with ingestion/updates, and as a result we have a huge number of deleted documents that take up an additional 65% of space. I have read that merging happens automatically and that it can also be forced with an ES command, but neither seems to work unless I stop item ingestion/updates. ES gives great performance for our aggregation queries on millions of items, so we are using it as our primary DB.

We switched from ES 2.X to ES 5.5 and still have this issue.

I have played with forcemerge, shard size, shard count & stopping ingestion. Only the last one worked.

Is there any way for us to reduce the deleted document count without stopping item ingestion/updates?


Solution

  • Elasticsearch 5.x should have a mechanism that applies backpressure to the indexing rate when merging falls behind, so it is worth checking whether that is actually happening in your case. There is also a setting you can experiment with that decides when a merge should be triggered based on the number of deletes. It is configured as part of the merge policy; see

    https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/index/MergePolicyConfig.java
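As a rough illustration of adjusting the merge policy on a live index, here is a minimal Python sketch. It assumes the delete-related settings in question are `index.merge.policy.expunge_deletes_allowed` and `index.merge.policy.reclaim_deletes_weight` from the 5.x `MergePolicyConfig`, and that they can be updated dynamically; the answer does not name a specific setting, so treat that as a guess. The host, the index name `items`, and the values are hypothetical placeholders, not recommendations. The sketch only builds the HTTP requests; nothing is sent until you pass them to `urlopen`.

```python
import json
from urllib import request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured node
INDEX = "items"                   # hypothetical index name


def merge_policy_settings(expunge_deletes_allowed=5.0, reclaim_deletes_weight=3.0):
    """Build a settings body for the delete-related merge policy knobs.

    The values are illustrative only; tune them against your own workload.
    """
    return {
        "index": {
            "merge": {
                "policy": {
                    "expunge_deletes_allowed": expunge_deletes_allowed,
                    "reclaim_deletes_weight": reclaim_deletes_weight,
                }
            }
        }
    }


def build_settings_request(base_url, index, settings):
    """Prepare (but do not send) a PUT to the index settings endpoint."""
    body = json.dumps(settings).encode("utf-8")
    return request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_expunge_request(base_url, index):
    """Prepare (but do not send) a force merge that only expunges deletes."""
    return request.Request(
        url=f"{base_url}/{index}/_forcemerge?only_expunge_deletes=true",
        method="POST",
    )


if __name__ == "__main__":
    settings_req = build_settings_request(ES_URL, INDEX, merge_policy_settings())
    expunge_req = build_expunge_request(ES_URL, INDEX)
    # Print what would be sent; call request.urlopen(req) to actually send it.
    for req in (settings_req, expunge_req):
        print(req.get_method(), req.full_url)
```

Lowering `expunge_deletes_allowed` should only matter for the `only_expunge_deletes` force merge, while `reclaim_deletes_weight` is meant to bias regular background merges toward segments with many deletes. Whether either helps under continuous ingestion is something to verify on a test index first.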