kubernetes, kubernetes-cronjob, k8s-cronjobber

K8s Job being constantly recreated


I have a CronJob that keeps restarting, despite its restartPolicy being set to Never:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: cron-zombie-pod-killer
spec:
  schedule: "*/9 * * * *"
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        metadata:
          name: cron-zombie-pod-killer
        spec:
          containers:
            - name: cron-zombie-pod-killer
              image: bitnami/kubectl
              command:
                - "/bin/sh"
              args:
                - "-c"
                - "kubectl get pods --all-namespaces --field-selector=status.phase=Failed | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod > /dev/null"
          serviceAccountName: pod-read-and-delete
          restartPolicy: Never

I would expect it to run every 9th minute, but that's not the case. When there are pods to clean up (i.e. when there's something for the pod to do), it runs normally. Once everything is cleaned up, it keeps restarting -> failing -> starting again, in a loop, every second.

Is there something I need to do to tell k8s that the job was successful, even if there's nothing to do (no pods to clean up)? What makes the job loop through restarts and failures?


Solution

  • That is by design. restartPolicy is not applied to the CronJob itself, but to the Pods the Job creates.

    If restartPolicy is set to Never, the Job controller will just create new Pods whenever the previous one fails. Setting it to OnFailure instead causes the existing Pod's container to be restarted in place, which prevents the stream of new Pods.

    This was discussed in this GitHub issue: Job being constantly recreated despite RestartPolicy: Never #20255


    Your kubectl command results in exit code 123 (xargs: any invocation exited with a non-zero status) if there are no Pods in the Failed state. With nothing on its stdin, xargs still runs kubectl delete pod once with no arguments, that invocation fails, and the non-zero exit causes the Job to fail and restart constantly (a small shell sketch reproducing this is included at the end of this answer).

    You can fix that by forcing the kubectl command to exit with code 0. Add || exit 0 to the end of it, as shown here and in the manifest sketch that follows:

    kubectl get pods --all-namespaces --field-selector=status.phase=Failed | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod > /dev/null || exit 0
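
    For context, here is a sketch of how the fix slots into the original jobTemplate (only the last args entry changes; alternatively, switching restartPolicy to OnFailure would also stop the stream of new Pods, as discussed above):

    jobTemplate:
      spec:
        template:
          spec:
            containers:
              - name: cron-zombie-pod-killer
                image: bitnami/kubectl
                command:
                  - "/bin/sh"
                args:
                  - "-c"
                  # "|| exit 0" makes the shell exit successfully even when the
                  # xargs/kubectl invocation fails because there was nothing to delete.
                  - "kubectl get pods --all-namespaces --field-selector=status.phase=Failed | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod > /dev/null || exit 0"
            serviceAccountName: pod-read-and-delete
            restartPolicy: Never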
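
    If you want to reproduce the failure mode yourself, here is a minimal shell sketch; it assumes GNU xargs and kubectl are available (as in the bitnami/kubectl image), and the exact kubectl error message may vary, but the exit-code behaviour is the point:

    # Empty stdin: xargs still runs the command once, with no arguments.
    # "kubectl delete pod" without a pod name fails, so xargs exits with
    # status 123 ("any invocation exited with a non-zero status"), and the
    # Job's Pod is therefore marked as failed.
    printf '' | xargs kubectl delete pod
    echo $?   # prints 123 when the delete invocation failed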