docker, kubernetes, kubernetes-helm, kubernetes-pvc

Allow scheduling multiple pods when pod anti-affinity is enabled


I have a deployment to which I have added the following affinity configuration:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - example.com
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: component
          operator: In
          values:
          - myapp
      topologyKey: "kubernetes.io/hostname"

Now, whenever I update the configuration of this deployment, the upgraded pod fails to schedule with this error:

Warning FailedScheduling 12s default-scheduler 0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod

Please provide suggestions on how I can fix this issue.

I tried the preferredDuringSchedulingIgnoredDuringExecution method, but no luck.

I have 3 nodes in my cluster.


Solution

  • Given that you have 3 replicas and 3 nodes in your cluster, the pods end up evenly distributed, one per node. When you update the configuration, a new pod is created, and the Kubernetes scheduler tries to place it on a node where no other pod with the label component=myapp is running. If every eligible node already has a pod with this label, the new pod cannot be scheduled, which produces the error you're seeing. Note that your required nodeAffinity rule additionally restricts scheduling to the single node example.com, which is why the event reports only 0/1 nodes available.
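
    You can confirm this by checking which nodes currently run a pod with that label and by inspecting the pending pod's scheduling events (<pending-pod-name> is a placeholder for the stuck pod's name):

    kubectl get pods -l component=myapp -o wide
    kubectl describe pod <pending-pod-name>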

    To address this issue, consider the following options:

    1. Use preferredDuringSchedulingIgnoredDuringExecution for the pod anti-affinity to make the rule a "soft" preference rather than a "hard" requirement. The scheduler will still try to spread the myapp pods across nodes, but during an update it may temporarily place two of them on the same node instead of leaving the new pod Pending.

    Example:

    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: component
                operator: In
                values:
                - myapp
            topologyKey: "kubernetes.io/hostname"
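
    After switching to the soft rule, you can apply the change and watch the rollout complete (assuming the manifest lives in deployment.yaml and the Deployment is named myapp; substitute your actual names):

    kubectl apply -f deployment.yaml
    kubectl rollout status deployment/myapp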
    
    2. Adjust the maxUnavailable parameter in your deployment's rolling update strategy.

    Example:

    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
    

    In the second example, maxUnavailable is set to 1, which lets Kubernetes take one old pod down during the rollout. Once that pod is terminated, its node no longer holds a pod with the component=myapp label, so the replacement can be scheduled there without violating the anti-affinity rule.
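
    One caveat: the default RollingUpdate strategy also sets maxSurge to 25%, so Kubernetes may still create the replacement pod before terminating the old one and hit the same scheduling deadlock. A minimal sketch that forces the old pod to be removed first, assuming the rest of your Deployment spec stays unchanged:

    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1  # allow one old pod to be taken down during the rollout
        maxSurge: 0        # do not create the replacement until the old pod is gone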

    If solutions 1 and 2 don't work, and your application can tolerate it, scale the Deployment down to fewer than 3 replicas (e.g., kubectl scale deployment myapp --replicas=2). This frees a node on which the new pod can be scheduled, and you can scale back up once the rollout finishes, as sketched below.
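
    A minimal sequence, assuming the Deployment is named myapp, normally runs 3 replicas, and its manifest lives in deployment.yaml (all placeholders for your actual names):

    kubectl scale deployment myapp --replicas=2   # free one node
    kubectl apply -f deployment.yaml              # roll out the updated configuration
    kubectl rollout status deployment/myapp       # wait for the replacement pod to come up
    kubectl scale deployment myapp --replicas=3   # restore the original replica count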