docker, kubernetes, kubernetes-helm, kubernetes-pvc

Allow scheduling multiple pods when pod anti-affinity is enabled


I have a deployment to which I have added the following affinity configuration:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - example.com
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: component
          operator: In
          values:
          - myapp
      topologyKey: "kubernetes.io/hostname"

Now, whenever I update the configuration of this deployment, the upgraded pod fails to schedule with this error:

Warning FailedScheduling 12s default-scheduler 0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod

Please provide suggestions on how I can fix this issue.

I tried the preferredDuringSchedulingIgnoredDuringExecution method, but no luck.

I have 3 nodes in my cluster.


Solution

  • Given that you have 3 replicas and 3 nodes in your cluster, the pods end up evenly distributed, one per node. When you update the configuration, a new pod is created, and the Kubernetes scheduler tries to place it on a node where no other pod with the label component=myapp is running. If every eligible node already has a pod with this label, the new pod cannot be scheduled, which produces the error you're seeing. Note that your required nodeAffinity rule additionally restricts scheduling to the single node example.com, which is why the event reports only 0/1 nodes available.
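
    You can confirm this by checking which nodes currently run a pod with that label and by inspecting the pending pod's scheduling events (<pending-pod-name> is a placeholder for the stuck pod's name):

    kubectl get pods -l component=myapp -o wide
    kubectl describe pod <pending-pod-name>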

    To address this issue, consider the following options:

    1. Use preferredDuringSchedulingIgnoredDuringExecution for the pod anti-affinity to make the rule a "soft" preference rather than a "hard" requirement. The scheduler will still try to spread the myapp pods across nodes, but during an update it may temporarily place two of them on the same node instead of leaving the new pod Pending.

    Example:

    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: component
                operator: In
                values:
                - myapp
            topologyKey: "kubernetes.io/hostname"
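
    After switching to the soft rule, you can apply the change and watch the rollout complete (assuming the manifest lives in deployment.yaml and the Deployment is named myapp; substitute your actual names):

    kubectl apply -f deployment.yaml
    kubectl rollout status deployment/myapp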
    
    2. Adjust the maxUnavailable parameter in your deployment's rolling update strategy.

    Example:

    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
    

    In the second example, maxUnavailable is set to 1, which lets Kubernetes take one old pod down during the rollout. Once that pod is terminated, its node no longer holds a pod with the component=myapp label, so the replacement can be scheduled there without violating the anti-affinity rule.
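
    One caveat: the default RollingUpdate strategy also sets maxSurge to 25%, so Kubernetes may still create the replacement pod before terminating the old one and hit the same scheduling deadlock. A minimal sketch that forces the old pod to be removed first, assuming the rest of your Deployment spec stays unchanged:

    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1  # allow one old pod to be taken down during the rollout
        maxSurge: 0        # do not create the replacement until the old pod is gone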

    If solutions 1 and 2 don't work, and your application can tolerate it, scale the Deployment down to fewer than 3 replicas (e.g., kubectl scale deployment myapp --replicas=2). This frees a node on which the new pod can be scheduled, and you can scale back up once the rollout finishes, as sketched below.
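
    A minimal sequence, assuming the Deployment is named myapp, normally runs 3 replicas, and its manifest lives in deployment.yaml (all placeholders for your actual names):

    kubectl scale deployment myapp --replicas=2   # free one node
    kubectl apply -f deployment.yaml              # roll out the updated configuration
    kubectl rollout status deployment/myapp       # wait for the replacement pod to come up
    kubectl scale deployment myapp --replicas=3   # restore the original replica count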