I am fairly new to Helm and Kubernetes, so I'm not sure if this is a bug or if I'm doing something wrong. I did look everywhere for an answer before posting, but I can't find anything that answers my question.
I have a deployment which uses a persistent volume and an init container. I pass it values to let Helm know whether the init container image, the main application container image, or both have changed.
Possibly relevant, possibly not: I deploy one Deployment for each of a range of web sources (which I call collectors). I don't know if this last part is relevant, but then, if I did, I probably wouldn't be here.
When I run
helm upgrade --install my-release helm_chart/ --values values.yaml --set init_image_tag=$INIT_IMAGE_TAG --set image_tag=$IMAGE_TAG
The first time, everything works fine. However, when I run it a second time with INIT_IMAGE_TAG the same but IMAGE_TAG changed, the Deployment never finishes rolling out the new image.
Expected behaviour: the main container is updated to the new image.
My values.yaml just contains a list called collectors
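For illustration, it looks something like this (the collector names here are made-up placeholders):
collectors:
  - collector-a
  - collector-b
  - collector-c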
My template is just:
{{ $env := .Release.Namespace }}
{{ $image_tag := .Values.image_tag }}
{{ $init_image_tag := .Values.init_image_tag }}
{{- range $colname := .Values.collectors }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ $colname }}-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ $colname }}-ingest
  labels:
    app: {{ $colname }}-ingest
spec:
  replicas: 1
  selector:
    matchLabels:
      app: {{ $colname }}-ingest
  template:
    metadata:
      labels:
        app: {{ $colname }}-ingest
    spec:
      securityContext:        # fsGroup must be nested under securityContext, not directly under the pod spec
        fsGroup: 1000
      containers:
        - name: {{ $colname }}-main
          image: xxxxxxx.dkr.ecr.eu-west-1.amazonaws.com/main_image:{{ $image_tag }}
          env:
            - name: COLLECTOR
              value: {{ $colname }}
          volumeMounts:
            - name: storage
              mountPath: /home/my/dir
      initContainers:
        - name: {{ $colname }}-init
          image: xxxxxxx.dkr.ecr.eu-west-1.amazonaws.com/init_image:{{ $init_image_tag }}
          volumeMounts:
            - name: storage
              mountPath: /home/my/dir
          env:
            - name: COLLECTOR
              value: {{ $colname }}
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: {{ $colname }}-claim
---
{{ end }}
Output of helm version:
version.BuildInfo{Version:"v3.2.0-rc.1", GitCommit:"7bffac813db894e06d17bac91d14ea819b5c2310", GitTreeState:"clean", GoVersion:"go1.13.10"}
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.9-eks-f459c0", GitCommit:"f459c0672169dd35e77af56c24556530a05e9ab1", GitTreeState:"clean", BuildDate:"2020-03-18T04:24:17Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Cloud Provider/Platform (AKS, GKE, Minikube etc.): EKS
Does anyone know whether this is a bug or whether I'm misusing Helm/Kubernetes somehow?
Thanks
When you update a Deployment, it goes through a couple of steps:
1. A new ReplicaSet is created with the updated pod template.
2. Pods for the new ReplicaSet are started, and they must pass their readiness checks.
3. Only then are the Pods from the old ReplicaSet scaled down and terminated.
The important detail here is that there is (intentionally) a state where both old and new pods are running.
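That overlap comes from the Deployment's default RollingUpdate strategy. As a rough sketch, the defaults look like this if you write them out explicitly:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # extra Pods allowed above the desired replica count during an update
      maxUnavailable: 25%  # Pods allowed to be unavailable during an update
With replicas: 1, maxSurge rounds up to 1 and maxUnavailable rounds down to 0, so Kubernetes insists on starting the new Pod before it will terminate the old one.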
In the example you show, you mount a PersistentVolumeClaim with a ReadWriteOnce access mode. This doesn't really work well with Deployments. While the old Pod is running, it owns the PVC mount, which will prevent the new Pod from starting up, which will prevent the Deployment from progressing. (This isn't really specific to Helm and isn't related to having an initContainer or not.)
There are a couple of options here:
Don't store data in a local volume. This is the best path, though it involves rearchitecting your application. Store data in a separate database container, if it's relational-type data (e.g., prefer a PostgreSQL container to SQLite in a volume); or if you have access to network storage like Amazon S3, keep things there. That completely avoids this problem and will let you run as many replicas as you need.
Use a ReadWriteMany volume. A persistent volume has an access mode. If you can declare the volume as ReadWriteMany, then multiple pods can mount it and this scenario will work. Many of the more common volume types don't support this access mode, though (AWSElasticBlockStore and HostPath notably are only ReadWriteOnce).
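For instance, on EKS a ReadWriteMany claim is typically backed by something like the EFS CSI driver rather than EBS. A minimal sketch of the claim, assuming a ReadWriteMany-capable storage class named efs-sc exists in the cluster (it is not part of the original chart):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ $colname }}-claim
spec:
  accessModes:
    - ReadWriteMany          # lets the old and new Pods mount the volume at the same time
  storageClassName: efs-sc   # assumed RWX-capable class (e.g. backed by the EFS CSI driver)
  resources:
    requests:
      storage: 10Gi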
Set the Deployment strategy to Recreate. You can configure how a Deployment manages updates. If you change to a Recreate strategy
apiVersion: apps/v1
kind: Deployment
spec:
  strategy:
    type: Recreate
then the old Pods will be deleted first. This will break zero-downtime upgrades, but it will allow this specific case to proceed.