My ES cluster has 20 machines running 50 nodes (ES instances), and I'm not sure how many racks I should configure. Are two racks enough, or would 3 or 4 racks be better?
As far as I know, setting rack_id in the ES configuration provides the following:
1. Control over shard allocation and relocation (to make sure replicas end up in different racks)
2. Use of rack_id for doc routing
Are there any reasons to configure more racks? I think even the default of a single rack is fine.
The chance that two machines go down at the same time is highest if they share hardware (for example, because they are VMs on the same host), smaller if they share a rack but not hardware, and smaller again if they share a building but not a rack. So it makes sense to use more than a single rack.
Whether you need more than 2 racks depends on your number of replicas. The default number of replicas is 1. If you require a higher value, then strictly speaking you degrade the availability of your cluster a bit by using only 2 racks: with 3 or more copies of a shard, the copies cannot all be placed in different racks, so the extra replicas are not effective at the rack level.
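
To make the replica arithmetic concrete, here is a small Python sketch. It is only a simplified model, not Elasticsearch code: it assumes the copies of a shard (1 primary plus the replicas) are spread as evenly as possible across the racks, and the function name `copies_surviving_rack_loss` is made up for illustration.

```python
from math import ceil


def copies_surviving_rack_loss(replicas: int, racks: int) -> int:
    """Worst-case number of copies of one shard left after a whole rack fails.

    Simplifying assumption: the copies (1 primary + replicas) are spread
    as evenly as possible across the racks.
    """
    copies = 1 + replicas
    # The fullest rack holds ceil(copies / racks) copies of the shard.
    fullest_rack = ceil(copies / racks)
    return copies - fullest_rack


if __name__ == "__main__":
    for replicas in (1, 2):
        for racks in (1, 2, 3):
            left = copies_surviving_rack_loss(replicas, racks)
            print(f"replicas={replicas} racks={racks} -> copies left: {left}")
```

Under this model, 2 replicas on 2 racks leave only 1 copy after a rack failure, the same as with 1 replica, while 2 replicas on 3 racks leave 2 copies. With a single rack, losing the rack loses every copy.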