The tutorial mentions that controllers can be deployed to manage a large number of clusters, but there is no doc/tutorial about it. From the code and the examples, it seems that to create a controller, I always need to pass the clusterName in.
How can I set up the controllers to let them manage more than one cluster and handle failure cases automatically?
We need to write a document on how to set this up.
The general idea of "Helix controller as a service" is that you create a controller cluster (also called a super cluster) that holds all Helix controller instances, and then link each cluster you want Helix to manage to this super cluster.
Sample steps to set it up are as follows:
git clone git://git.apache.org/helix.git
cd helix
mvn clean install package -DskipTests
cd helix-core/target/helix-core-pkg/bin
chmod +x ./helix-admin.sh
./helix-admin.sh --addCluster mySuperCluster --zkSvr <ZKSERVER:PORT>
./helix-admin.sh --addNode mySuperCluster myController-1_12345 --zkSvr <ZKSERVER:PORT>
./helix-admin.sh --addNode mySuperCluster myController-2_12345 --zkSvr <ZKSERVER:PORT>
./helix-admin.sh --addNode mySuperCluster myController-3_12345 --zkSvr <ZKSERVER:PORT>
./run-helix-controller.sh --cluster mySuperCluster --mode DISTRIBUTED --controllerName myController-1_12345 --zkSvr <ZKSERVER:PORT>
./run-helix-controller.sh --cluster mySuperCluster --mode DISTRIBUTED --controllerName myController-2_12345 --zkSvr <ZKSERVER:PORT>
./run-helix-controller.sh --cluster mySuperCluster --mode DISTRIBUTED --controllerName myController-3_12345 --zkSvr <ZKSERVER:PORT>
Now your super cluster is set up and live.
Suppose you have two clusters (say storageCluster-1 and storageCluster-2) that you would like Helix to manage. You can link them to your super cluster as follows:
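If you prefer to script the setup, the commands above can be wrapped in a small loop. This is only a sketch: it uses exactly the flags shown above, but ZK_SVR, DRY_RUN and the run helper are names made up for this example, localhost:2181 is an assumed ZooKeeper address, and it assumes run-helix-controller.sh stays in the foreground (which is why each controller is backgrounded). By default it only prints the commands.

```shell
#!/bin/sh
# Run from helix-core/target/helix-core-pkg/bin.
# ZK_SVR and DRY_RUN are hypothetical variables introduced for this sketch.
ZK_SVR="${ZK_SVR:-localhost:2181}"   # assumed ZooKeeper address; replace with yours
SUPER_CLUSTER="mySuperCluster"
CONTROLLERS="myController-1_12345 myController-2_12345 myController-3_12345"
DRY_RUN="${DRY_RUN:-1}"              # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the super cluster and register every controller as a node in it.
setup_super_cluster() {
  run ./helix-admin.sh --addCluster "$SUPER_CLUSTER" --zkSvr "$ZK_SVR"
  for c in $CONTROLLERS; do
    run ./helix-admin.sh --addNode "$SUPER_CLUSTER" "$c" --zkSvr "$ZK_SVR"
  done
}

# Start one controller process per name, in DISTRIBUTED mode.
start_controllers() {
  for c in $CONTROLLERS; do
    # Assumption: the controller script blocks, so each one is backgrounded.
    run ./run-helix-controller.sh --cluster "$SUPER_CLUSTER" --mode DISTRIBUTED \
      --controllerName "$c" --zkSvr "$ZK_SVR" &
  done
  wait   # keeps the script alive for as long as the controllers run
}

setup_super_cluster
start_controllers
```

Run it once as is to check the printed commands, then again with DRY_RUN=0 and ZK_SVR set to your ZooKeeper address. In a real deployment you would normally start each controller on a different machine rather than all three from one script.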
./helix-admin.sh --activateCluster storageCluster-1 mySuperCluster true --zkSvr <ZKSERVER:PORT>
./helix-admin.sh --activateCluster storageCluster-2 mySuperCluster true --zkSvr <ZKSERVER:PORT>
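With more than a couple of clusters, the same --activateCluster call can be repeated in a loop. Again a sketch only: ZK_SVR, DRY_RUN, run and link_clusters are hypothetical names for this example, localhost:2181 is an assumed ZooKeeper address, and the clusters are assumed to exist already.

```shell
#!/bin/sh
# Run from helix-core/target/helix-core-pkg/bin.
ZK_SVR="${ZK_SVR:-localhost:2181}"   # assumed ZooKeeper address; replace with yours
SUPER_CLUSTER="mySuperCluster"
DRY_RUN="${DRY_RUN:-1}"              # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Link every cluster named on the command line to the super cluster.
link_clusters() {
  for cluster in "$@"; do
    run ./helix-admin.sh --activateCluster "$cluster" "$SUPER_CLUSTER" true --zkSvr "$ZK_SVR"
  done
}

link_clusters storageCluster-1 storageCluster-2
```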
Now both of your clusters will be managed by one of the Helix controllers from the super cluster. If a controller dies, Helix will automatically switch your clusters to another controller.