solr solr5

Correctly distribute replicas across nodes using


Using Solr v5.5

We are trying to balance our shard replica placement using rule-based replica placement: https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement

We have defined a rule 'replica:<5,node:*'

Our setup: 10 Solr instances, 20 shards, replication factor 2.

So what we want is for each instance to host 4 different shards, acting as primary for 2 and as replica for 2 more. With our rule this almost works, but there are always a couple of instances that end up hosting both replicas of 1 or 2 shards, e.g.:

instance0: shard1-replica1
           shard1-replica2
           shard2-replica1
           shard2-replica2
instance1: shard3-replica1
           shard3-replica2
           shard4-replica1
           shard5-replica2                  

Any ideas how we can improve the rule so that it prevents this sort of collision?
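
To make the collision concrete, here is a minimal Python sketch that flags any node holding more than one replica of the same shard. The `find_collisions` helper and the `placement` dictionary are hypothetical names used only for illustration. The data is copied by hand from the example layout above, not read from Solr.

```python
from collections import Counter


def find_collisions(placement):
    """Return {node: [shards]} for shards with more than one replica on that node."""
    collisions = {}
    for node, cores in placement.items():
        # Count how many replicas of each shard live on this node.
        counts = Counter(shard for shard, _replica in cores)
        dupes = sorted(shard for shard, n in counts.items() if n > 1)
        if dupes:
            collisions[node] = dupes
    return collisions


# Hand-copied from the example layout in the question (hypothetical input format).
placement = {
    "instance0": [("shard1", "replica1"), ("shard1", "replica2"),
                  ("shard2", "replica1"), ("shard2", "replica2")],
    "instance1": [("shard3", "replica1"), ("shard3", "replica2"),
                  ("shard4", "replica1"), ("shard5", "replica2")],
}

collisions = find_collisions(placement)
print(collisions)
# → {'instance0': ['shard1', 'shard2'], 'instance1': ['shard3']}
```

For the setup described, 20 shards x 2 replicas / 10 instances gives 4 cores per instance, so a clean placement would return an empty dictionary here.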


Solution

  • So after doing some testing with different rules, we found that the rule

    shard:*,replica:4,node:*
    

    mostly fixed the issue. This seems to consider each shard in the cluster, across multiple collections. We have two collections, so although the rule forces 4 different shards per node for a collection, it doesn't enforce 2 leaders + 2 followers per node, which the previous rule did.

    Redundancy is more important to us than the balance of leaders/followers per node, and performance seems to be consistent, so this solution is good enough.
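
For reference, placement rules are passed through the `rule` parameter of the Collections API `CREATE` action when the collection is created. The sketch below only builds the request URL so the encoding of the rule is visible. The host, port and collection name are placeholders, `build_create_url` is a hypothetical helper, and other parameters your cluster may need (config name, and so on) are left out.

```python
from urllib.parse import urlencode


def build_create_url(base_url, name, num_shards, replication_factor, rule):
    """Build a Collections API CREATE URL that carries a placement rule."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "rule": rule,  # urlencode escapes ':', ',', '*' and '<' for us
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and collection name; adjust for your own cluster.
url = build_create_url(
    "http://localhost:8983/solr",
    "mycollection",
    num_shards=20,
    replication_factor=2,
    rule="shard:*,replica:4,node:*",
)
print(url)
```

Building the URL with `urlencode` rather than by hand avoids mistakes with characters such as `<` if you experiment with other rules like the original `replica:<5,node:*`.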