I am having some issues with a Kubernetes CronJob running two containers inside of a GKE cluster.
One of the two containers is the one that actually executes the job the CronJob is meant to do. This part works perfectly fine: it starts when it is supposed to, does its job and then terminates.
What seems to be causing the trouble is the second container, a sidecar used to access a database instance. It never terminates, which apparently keeps the Job itself from completing. That is a problem, since I see an accumulation of running Job instances over time.
Is there a way to configure a Kubernetes batch CronJob so that it terminates once one of its containers has finished successfully? This is my (abbreviated) configuration:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: chron-job-with-a-sidecar
  namespace: my-namespace
spec:
  # ┌───────────── minute (0 - 59)
  # │ ┌───────────── hour (0 - 23)
  # │ │ ┌───────────── day of the month (1 - 31)
  # │ │ │ ┌───────────── month (1 - 12)
  # │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
  # │ │ │ │ │              7 is also Sunday on some systems)
  # │ │ │ │ │              OR sun, mon, tue, wed, thu, fri, sat
  # │ │ │ │ │
  schedule: "0 8 * * *" # -> every day at 8 AM
  jobTemplate:
    metadata:
      labels:
        app: my-label
    spec:
      template:
        spec:
          containers:
          # --- JOB CONTAINER ---------------------------------------------
          - image: my-job-image:latest
            imagePullPolicy: Always
            name: my-job
            command:
            - /bin/sh
            - -c
            - /some-script.sh; exit 0;
          # --- SIDECAR CONTAINER -----------------------------------------
          - command:
            - "/cloud_sql_proxy"
            - "-instances=my-instance:antarctica-south-3:user=tcp:1234"
            # ... some other settings ...
            image: gcr.io/cloudsql-docker/gce-proxy:1.30.0
            imagePullPolicy: Always
            name: cloudsql-proxy
          # ... some other values ...
If you are running Kubernetes 1.29, the "sidecar containers" feature is now enabled by default.
As per https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/#sidecar-example, sidecar containers go into the initContainers section. The key field is restartPolicy: setting it to Always is what differentiates a sidecar from an ordinary init container that the main containers have to wait for.
Example flagrantly copied and pasted from the above link:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: alpine:latest
          command: ['sh', '-c', 'while true; do echo "logging" >> /opt/logs.txt; sleep 1; done']
          volumeMounts:
            - name: data
              mountPath: /opt
      initContainers:
        - name: logshipper
          image: alpine:latest
          restartPolicy: Always
          command: ['sh', '-c', 'tail -F /opt/logs.txt']
          volumeMounts:
            - name: data
              mountPath: /opt
      volumes:
        - name: data
          emptyDir: {}
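Applied to the CronJob from the question, that would mean moving the Cloud SQL proxy into initContainers with restartPolicy: Always, so the Job is considered complete as soon as my-job exits. Here is a minimal sketch, assuming your GKE cluster is on Kubernetes 1.29 or later; the names, images and flags are kept from the question and the restartPolicy value for the pod is my assumption:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: chron-job-with-a-sidecar
  namespace: my-namespace
spec:
  schedule: "0 8 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never  # Job pods must use Never or OnFailure; adjust to your setup
          initContainers:
          # Declared as an init container with restartPolicy: Always, the proxy
          # runs as a native sidecar: it starts before my-job and is shut down
          # automatically once my-job has finished.
          - name: cloudsql-proxy
            image: gcr.io/cloudsql-docker/gce-proxy:1.30.0
            imagePullPolicy: Always
            restartPolicy: Always
            command:
            - "/cloud_sql_proxy"
            - "-instances=my-instance:antarctica-south-3:user=tcp:1234"
          containers:
          # The actual job; the pod (and therefore the Job) completes
          # as soon as this container exits successfully.
          - name: my-job
            image: my-job-image:latest
            imagePullPolicy: Always
            command:
            - /bin/sh
            - -c
            - /some-script.sh; exit 0;

With this layout the pod no longer waits for the proxy to exit on its own, so the accumulation of running Job instances should stop.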