Tags: kubernetes, availability

How should an application using an active/passive redundancy model be containerized with Kubernetes?


I have a distributed application running on virtual machines, among which one service runs in active/passive mode. The active VM provides service via a public IP. Should the active VM fail, the public IP is moved to the passive VM, which then becomes active and starts providing service.

How does this pattern fit into a containerized application managed by Kubernetes?

If I use a replication controller with replicas=1, then in the event of a node/minion failure the replication controller will reschedule the pod (= a VM in my current application) on another minion. But this would likely cause higher downtime compared with my current solution, where only the IP resource is moved.

If I use a replication controller with replicas=2, then I would need two different configurations for the two pods (one with the public IP, the other without), which is an anti-pattern. Furthermore, there is no designed way in Kubernetes to support a virtual IP that moves between pods, is there?

Or should I use replicas=2 and implement something myself to manage the IP (perhaps using Pacemaker)? That would introduce another problem: there would then be two cluster managers in my application, Kubernetes and Pacemaker/Corosync.

So, how should this be done?


Solution

  • It sounds like your application is using its own master-election scheme between the two VMs acting as a load balancer, and internally you know which one is currently the master.

    This can be achieved today in Kubernetes using a service that spans both pods (master and standby) and a readiness probe that only returns success for the currently active master. Failure of a readiness probe removes the pod from the endpoints list, so no traffic will be directed to the node that isn't the master. When you need to do failover, the standby would report healthy to the readiness probe (and the master would report unhealthy or be unreachable) at which point traffic to the service would only land on the standby (now acting as the master).
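    A minimal sketch of this, assuming a hypothetical image and a hypothetical `/is-master` HTTP endpoint that returns 200 only on the currently active instance:

    ```yaml
    # Two replicas run at all times, but only the pod whose
    # readiness probe succeeds (the current master) receives traffic.
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: my-ha-service
    spec:
      replicas: 2
      selector:
        app: my-ha-service
      template:
        metadata:
          labels:
            app: my-ha-service
        spec:
          containers:
          - name: app
            image: example/my-ha-service:latest  # hypothetical image
            ports:
            - containerPort: 8080
            readinessProbe:
              httpGet:
                path: /is-master   # hypothetical endpoint: 200 only on the master
                port: 8080
              periodSeconds: 2     # probe frequently to keep failover time low
              failureThreshold: 1  # drop from endpoints after one failed probe
    ```

    On failover, the standby starts answering `/is-master` with 200 while the old master stops, so the service's endpoint list switches over without any pod being rescheduled.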

    You can create the service that spans the two pods with an external IP such that it is reachable from outside of your cluster.
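    Such a service might look like the following sketch; the external IP shown is an assumed placeholder for your public address:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: my-ha-service
    spec:
      selector:
        app: my-ha-service   # matches both pods; only ready pods become endpoints
      ports:
      - port: 80
        targetPort: 8080
      externalIPs:
      - 203.0.113.10         # assumed public IP, routed to a cluster node
    ```

    Because the selector matches both pods but unready pods are excluded from the endpoints, external traffic to this IP always reaches only the current master.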