Tags: kubernetes, openshift, readinessprobe, livenessprobe

Kubernetes postStart hook leads to race condition


I run MySQL on Kubernetes with a postStart hook that should run a query after the database starts.

This is the relevant part of my template.yaml:

    spec:
      containers:
        - name: ${{APP}}
          image: ${REGISTRY}/${NAMESPACE}/${APP}:${VERSION}
          imagePullPolicy: Always
          lifecycle:
            postStart:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - hostname && sleep 12 && echo $QUERY | /opt/rh/rh-mysql80/root/usr/bin/mysql
                    -h localhost -u root -D grafana
                    -P 3306
          ports:
            - name: tcp3306
              containerPort: 3306
          readinessProbe:
            tcpSocket:
              port: 3306
            initialDelaySeconds: 15
            timeoutSeconds: 1
          livenessProbe:
            tcpSocket:
              port: 3306
            initialDelaySeconds: 120
            timeoutSeconds: 1

When the pod starts, the PVC for the database gets corrupted and the pod crashes. When I restart the pod, it works. My guess is that the query runs before the database is up. This might be fixable with the readinessProbe, but I am not an expert on these topics.

Has anyone else run into a similar issue and know how to fix it?


Solution

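  • Note that postStart is called at least once but may also be called more than once, which makes it a bad place to run a query. The hook also runs asynchronously with the container's entrypoint, so there is no guarantee that MySQL is accepting connections when it fires.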

    You can set the pod's restartPolicy: OnFailure and run the query from a separate MySQL container in the same pod. Start that second container with a wait step, then run your query. Note that the query should be idempotent, or your data integrity may break; consider the case where the pod is re-created with an existing data volume.
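
The two-container approach could look roughly like the sketch below. It is only an illustration under several assumptions: the container name run-query is made up, the second container reuses the database image just to get the mysql client at the path from the question, QUERY is assumed to be set in that container's environment, and root is assumed to be allowed to connect over TCP without a password. Also note that a Deployment's pod template only accepts restartPolicy: Always, so this fits a bare pod rather than a Deployment.

```yaml
spec:
  restartPolicy: OnFailure   # a query container that exits successfully is not restarted
  containers:
    - name: ${{APP}}         # the existing database container, without the postStart hook
      image: ${REGISTRY}/${NAMESPACE}/${APP}:${VERSION}
      ports:
        - name: tcp3306
          containerPort: 3306
    - name: run-query        # hypothetical name for the second container
      image: ${REGISTRY}/${NAMESPACE}/${APP}:${VERSION}   # assumption: reused for its mysql client
      command:
        - /bin/sh
        - -c
        - |
          MYSQL=/opt/rh/rh-mysql80/root/usr/bin/mysql
          # Wait until the server answers a trivial query over TCP.
          until "$MYSQL" -h 127.0.0.1 -P 3306 -u root -e 'SELECT 1' >/dev/null 2>&1; do
            sleep 2
          done
          # QUERY is assumed to be set in this container's environment.
          echo "$QUERY" | "$MYSQL" -h 127.0.0.1 -P 3306 -u root -D grafana
```

The sketch connects to 127.0.0.1 instead of localhost because containers in a pod share the network namespace but not the MySQL socket file, so the client has to use TCP.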