etcd, etcd3, etcdctl

How to fail over in an etcd cluster


I have a 3-node etcd cluster, i.e. one master and two slaves. I need to bring down the master node for maintenance, so I tried running an election to elect a new master, but it didn't work.

Below is the current state of the etcd cluster

etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status
+---------------------+------------------+---------+---------+-----------+-----------+------------+
|      ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+---------------------+------------------+---------+---------+-----------+-----------+------------+
|        X.X.X.5:2379 | ac354ac61b853b35 |  3.2.26 |   25 kB |      true |        12 |         13 |
|        X.X.X.6:2379 |  7f34769979eb782 |  3.2.26 |   25 kB |     false |        12 |         13 |
|        X.X.X.9:2379 | 9174c96c4669dfb5 |  3.2.26 |   25 kB |     false |        12 |         13 |
+---------------------+------------------+---------+---------+-----------+-----------+------------+

Below is the command I used to run the election. I ran it from node 3, i.e. X.X.X.9, but the command hung forever. I am new to etcd, so I am not sure whether I am using the command correctly.

etcdctl --endpoints=$ENDPOINTS elect failover app03
failover/37827ec3fd292b03
app03

Thanks in advance


Solution

  • TL;DR:

    etcdctl --endpoints=$ENDPOINTS move-leader 9174c96c4669dfb5
    

    See the etcdctl move-leader docs.

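    In a few more words:

    I think "master" and "slave" are a misleading description of what happens in etcd. It's much better to think of them as 3 members, one of which is the leader. At any time you can take down a minority of the members (1 in a 3-member cluster) and etcd will keep working. If the member you took down was the leader, the remaining members hold an election among themselves and choose a new leader. Note that etcdctl elect does something different: it runs an application-level election on the name you give it (failover here) and does not change the cluster's Raft leader. Once it wins, it prints the election key and the proposal and then keeps holding the election until interrupted, which is why it looked stuck.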

    More reading: https://etcd.io/docs/v3.5/op-guide/runtime-configuration/
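
To make the reasoning above concrete, here is a rough sketch of the maintenance flow as shell helpers. The function names (quorum, tolerable_failures, hand_over_leader) are made up for illustration and are not part of etcd. The sketch assumes that $ENDPOINTS lists every member including the current leader, and that your etcdctl build actually ships move-leader (worth confirming with etcdctl --help, since older releases may not have it).

```shell
#!/bin/sh
# Sketch only: these helper names are illustrative, not etcd commands.

# Smallest number of members that forms a majority in a cluster of $1 members.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# How many members can be down while the remaining ones still form a quorum.
tolerable_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Hand leadership to the member whose hex ID is $1, then print the status
# table so the IS LEADER column can be checked before stopping the old leader.
# Assumption: ENDPOINTS is set and includes the current leader's endpoint.
# ETCDCTL_API=3 selects the v3 API, which older etcdctl builds do not default to.
hand_over_leader() {
    ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINTS" move-leader "$1" &&
    ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINTS" --write-out=table endpoint status
}
```

For the cluster in the question, tolerable_failures 3 prints 1, so one member can be stopped safely. Running hand_over_leader 9174c96c4669dfb5 first would move leadership to X.X.X.9, and once the status table shows the new leader, the old one can be taken down for maintenance. Moving the leader beforehand is optional, since the remaining members elect a new leader on their own, but it avoids the short pause while that election happens.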