docker, docker-swarm, docker-stack

Docker Swarm replicates more than it should


I'm running a Docker swarm to handle one application that needs to be always available but must run on only one node at a time. If the application runs twice, it crashes on both nodes.

I use Docker Swarm to achieve this, and for the most part it works perfectly, but recently I ran into the problem that Docker decided to run two replicas of the application.

ID             NAME          MODE         REPLICAS   IMAGE           PORTS
qro2usyj798l   cmdr_cmdr     replicated   2/1        some-name       

As you can see, the odd thing is that it knows it is running more replicas than it should.

Is there a way to force Docker to never run more than one instance?


Solution

  • Run docker service ps against the service to find out which nodes the replicas are running on and why.

    This situation can arise in a couple of ways:

    1. The service is configured with the "start-first" update order. By definition, the task count exceeds the desired replica count while an update is being applied, but the service can get stuck in this state if the new replica cannot become healthy because it cannot acquire resources the old replica is still holding. Because Docker does not want to interrupt service availability, it will not stop the old replica until the new one is healthy, so the update deadlocks.

    2. A worker node stops responding to the managers. The managers then (re)start any tasks that were hosted on the non-responding worker. However, because the worker is not responding, it cannot confirm that the old replicas have actually been stopped, so Swarm continues to track those replicas as active until the dead worker node is properly removed from the swarm or rejoins.
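
The checks described above could be scripted roughly as follows. This is only a sketch: the service name cmdr_cmdr is taken from the listing in the question, the function names are made up for illustration, and the node name passed to remove_dead_node is whatever docker node ls reports as Down in your swarm. The docker commands must be run on a manager node.

```shell
#!/bin/sh
# Service name from the question's listing; override via the environment.
SERVICE="${SERVICE:-cmdr_cmdr}"

# Succeeds when a bare REPLICAS value such as "2/1" (running/desired)
# shows more running tasks than desired. Values with a suffix like
# "(max 1 per node)" are not handled by this helper.
over_replicated() {
    running=${1%%/*}
    desired=${1##*/}
    [ "$running" -gt "$desired" ]
}

# List every task of the service with its node, desired state,
# current state and untruncated error message.
show_tasks() {
    docker service ps --no-trunc "$SERVICE"
}

# Cause 1: print the service configuration in readable form; the
# UpdateConfig section includes the update order.
show_update_order() {
    docker service inspect --pretty "$SERVICE"
}

# Cause 1: stop the old task before starting the new one, so an update
# never needs two copies of the application at once.
use_stop_first() {
    docker service update --update-order stop-first "$SERVICE"
}

# Cause 2: look for workers whose STATUS is Down.
show_nodes() {
    docker node ls
}

# Cause 2: remove a dead worker so Swarm stops tracking its tasks.
# $1 is the node ID or hostname reported by show_nodes.
remove_dead_node() {
    docker node rm --force "$1"
}
```

After sourcing the file, something like over_replicated 2/1 && show_tasks prints the task list only when the service is over its replica count. Switching to stop-first trades a short gap in availability during updates for the guarantee that an update itself does not start a second copy.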