I am running a 3-node ZooKeeper Kubernetes StatefulSet for my Kafka cluster, with pods named zookeeper-0, zookeeper-1 and zookeeper-2, and I have a liveness probe enabled that uses the ruok command. If any pod gets restarted due to a failure, the quorum breaks and ZooKeeper stops working, even after the failed pod has come back up and its liveness probe responds ok.
When this happens I have to manually restart all ZooKeeper instances to get the ensemble working again. The same thing happens when I do a helm upgrade: during the upgrade the first instance to be restarted is zookeeper-2, then zookeeper-1 and finally zookeeper-0, but it seems ZooKeeper only recovers if I start all instances together. So after every helm upgrade I have to restart all instances manually.
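For reference, the manual restart I end up doing looks roughly like this (assuming the StatefulSet is named zookeeper and its pods carry the label app=zookeeper; adjust names to your setup):

# Delete all ZooKeeper pods in one go; the StatefulSet controller recreates
# them and the ensemble re-forms its quorum from scratch.
kubectl delete pod -l app=zookeeper

# Equivalent, more explicit variant:
kubectl delete pod zookeeper-0 zookeeper-1 zookeeper-2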
My question is:
What could be the reason for this behaviour? And what is the best way to ensure a fully reliable ZooKeeper StatefulSet in a Kubernetes environment?
This is caused by a bug in ZooKeeper 3.5.8: https://issues.apache.org/jira/browse/ZOOKEEPER-3829. If you come across this issue, upgrade to the latest ZooKeeper release.
When you do a helm upgrade, the instances are rolled in reverse ordinal order. So with zookeeper-0, zookeeper-1 and zookeeper-2, zookeeper-2 gets upgraded first, then zookeeper-1 and finally zookeeper-0. The issue is even more evident if you have 5 instances.
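For context, that ordering is simply how StatefulSet rolling updates work: pods are replaced one at a time in reverse ordinal order. A minimal sketch of the relevant spec fields (resource names, labels and the image tag are illustrative, not taken from any particular chart):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  serviceName: zookeeper-headless
  replicas: 3
  selector:
    matchLabels:
      app: zookeeper
  # Default update strategy: on upgrade, pods are terminated and recreated
  # one at a time, starting at the highest ordinal (zookeeper-2) and ending
  # at zookeeper-0.
  updateStrategy:
    type: RollingUpdate
  # Default policy: controls ordering for scaling (create/delete), not updates.
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      containers:
        - name: zookeeper
          image: zookeeper:3.6.3   # example tag; pick a release that contains the ZOOKEEPER-3829 fix
          ports:
            - containerPort: 2181
              name: client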
To work around this instability I added liveness and readiness probes, so ZooKeeper gets restarted automatically when the issue happens and the ensemble self-heals.
For the Kubernetes StatefulSet:
livenessProbe:
  exec:
    command:
      - sh
      - -c
      - /bin/liveness-check.sh 2181
  initialDelaySeconds: 120   # default 0
  periodSeconds: 60
  timeoutSeconds: 10
  failureThreshold: 2
  successThreshold: 1
readinessProbe:
  exec:
    command:
      - sh
      - -c
      - /bin/readiness-check.sh 2181
  initialDelaySeconds: 20
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 2
  successThreshold: 1
The scripts are defined in a Kubernetes ConfigMap:
data:
  liveness-check.sh: |
    #!/bin/sh
    zkServer.sh status
  readiness-check.sh: |
    #!/bin/sh
    echo ruok | nc 127.0.0.1 $1
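The probes above call /bin/liveness-check.sh and /bin/readiness-check.sh inside the container, so the ConfigMap has to be mounted there with execute permission. A sketch of the pod spec fragment, assuming the ConfigMap is named zookeeper-probe-scripts (use whatever name your chart actually gives it):

containers:
  - name: zookeeper
    # image, env, ports etc. as in your existing container definition
    volumeMounts:
      - name: probe-scripts
        mountPath: /bin/liveness-check.sh
        subPath: liveness-check.sh
      - name: probe-scripts
        mountPath: /bin/readiness-check.sh
        subPath: readiness-check.sh
volumes:
  - name: probe-scripts
    configMap:
      name: zookeeper-probe-scripts
      defaultMode: 0755   # scripts must be executable for the exec probes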
If for some reason the zkServer.sh script is not available in your Docker image, you can define the ConfigMap as follows:
data:
  readiness-check.sh: |
    #!/bin/sh
    # $1 is the client port passed by the probe (e.g. 2181)
    OK=$(echo ruok | nc 127.0.0.1 $1)
    if [ "$OK" = "imok" ]; then
      exit 0
    else
      echo "Readiness Check Failed for ZooKeeper"
      exit 1
    fi
  liveness-check.sh: |
    #!/bin/sh
    # The node is considered alive if it reports its Mode (leader/follower)
    echo stat | nc 127.0.0.1 $1 | grep Mode
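One caveat: since ZooKeeper 3.5.3 the four-letter-word commands are whitelisted, and only srvr is enabled by default, so ruok and stat have to be allowed explicitly or these scripts will always fail. Depending on your image this goes into zoo.cfg (some images, e.g. the official zookeeper image, expose it via the ZOO_4LW_COMMANDS_WHITELIST environment variable):

# zoo.cfg
4lw.commands.whitelist=ruok, stat, srvr

You can verify from inside a pod with echo ruok | nc 127.0.0.1 2181, which should print imok once the whitelist is in place.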