kubernetes kube-scheduler

How to use the Kubernetes scheduler.alpha.kubernetes.io/preferAvoidPods annotation?


First of all, for various reasons I'm using an unsupported and obsolete version of Kubernetes (1.12), and I can't upgrade.

I'm trying to configure the scheduler to avoid running pods on some nodes by changing the node score used when the scheduler looks for the best available node. I would like to do that at the scheduler level, not by using nodeAffinity at the deployment, replicaset, or pod level, so that all pods are affected by the change.

After reading the k8s docs here: https://kubernetes.io/docs/reference/scheduling/config/#scheduling-plugins and checking that some options were already present in 1.12, I'm trying to use the NodePreferAvoidPods plugin. The documentation describes the plugin as follows:

Scores nodes according to the node annotation scheduler.alpha.kubernetes.io/preferAvoidPods

Which, if I understand correctly, should do the job.

So I've updated the static manifest kube-scheduler.yaml to use the following config:

apiVersion: kubescheduler.config.k8s.io/v1alpha1
kind: KubeSchedulerConfiguration
profiles:
    - plugins:
        score:
          enabled:
          - name: NodePreferAvoidPods
            weight: 100
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf

But adding the annotation scheduler.alpha.kubernetes.io/preferAvoidPods: to the node doesn't seem to work.
For testing, I made a basic nginx deployment with as many replicas as there are worker nodes (4).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 4
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80

Then I check where the pods were created with kubectl get pods -owide

So I believe this annotation requires a specific value to work.
I've tried setting the annotation to "true" and "1", but k8s refuses the change. I can't figure out what the valid values for this annotation are, and I can't find any documentation about it.

I've checked the git release for 1.12 and this plugin was already present (at least there are some lines of code for it). I don't think the behavior or settings have changed much since.

Thanks.


Solution

  • From the Kubernetes source code, here is a valid value for this annotation:

    {
      "preferAvoidPods": [
        {
          "podSignature": {
            "podController": {
              "apiVersion": "v1",
              "kind": "ReplicationController",
              "name": "foo",
              "uid": "abcdef123456",
              "controller": true
            }
          },
          "reason": "some reason",
          "message": "some message"
        }
      ]
    }
    

    But there are no details on how to predict the uid, and no answer was given when someone else asked on GitHub years ago: https://github.com/kubernetes/kubernetes/issues/41630
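
    As far as I can tell, the uid cannot be predicted but it can be read from the controller once it exists, since it appears to be the controller's own metadata.uid. The sketch below builds the annotation value from that; avoid_pods_value is a hypothetical helper (not part of kubectl), and RS_NAME and NODE_NAME are placeholders. I am assuming that only ReplicationController and ReplicaSet controllers are matched, so treat this as an illustration rather than a verified recipe.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): prints a preferAvoidPods
# annotation value for a single controller.
# $1 = kind (ReplicationController or ReplicaSet), $2 = apiVersion,
# $3 = controller name, $4 = controller UID.
avoid_pods_value() {
  printf '{"preferAvoidPods":[{"podSignature":{"podController":{"apiVersion":"%s","kind":"%s","name":"%s","uid":"%s","controller":true}},"reason":"%s","message":"%s"}]}\n' \
    "$2" "$1" "$3" "$4" "some reason" "some message"
}

# Usage against a live cluster (RS_NAME and NODE_NAME are placeholders):
#   uid=$(kubectl get replicaset RS_NAME -o jsonpath='{.metadata.uid}')
#   kubectl annotate node NODE_NAME \
#     "scheduler.alpha.kubernetes.io/preferAvoidPods=$(avoid_pods_value ReplicaSet apps/v1 RS_NAME "$uid")"
```

    Because a Deployment creates a new ReplicaSet (with a new uid) when its pod template changes, an annotation built this way would need to be refreshed after each rollout, and it only covers the controllers it lists, not every pod.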

    For my initial question, which was how to avoid scheduling pods on a node, I found another method: using the well-known taint node.kubernetes.io/unschedulable with the effect PreferNoSchedule.

    Tainting a node with the following command does the job, and the taint seems to persist across cordon/uncordon (a cordon will set it to NoSchedule and an uncordon will set it back to PreferNoSchedule).

    kubectl taint node NODE_NAME node.kubernetes.io/unschedulable=:PreferNoSchedule
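
    To apply the same taint to several nodes, a small wrapper can print the commands for review before anything is run. This is only a sketch: taint_commands is a hypothetical helper name, and worker-1 and worker-2 are placeholder node names.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the taint command for each
# node name passed as an argument, so the list can be reviewed first.
taint_commands() {
  for node in "$@"; do
    printf 'kubectl taint node %s node.kubernetes.io/unschedulable=:PreferNoSchedule\n' "$node"
  done
}

# Review the output, then pipe it to sh to apply it:
#   taint_commands worker-1 worker-2 | sh
# Check the result on a node:
#   kubectl describe node worker-1 | grep -i taints
# Remove the taint later by appending "-" to the key and effect:
#   kubectl taint node worker-1 node.kubernetes.io/unschedulable:PreferNoSchedule-
```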