
Is there a way to enable shareProcessNamespace for helm post-install hook?


I'm running a pod with 3 containers (telegraf, fluentd and an in-house agent) that makes use of shareProcessNamespace: true.

I've written a Python script to fetch the initial config for telegraf and fluentd from a central controller API endpoint. Since this is a one-time operation, I plan to use a Helm post-install hook.

apiVersion: batch/v1
kind: Job
metadata:
  name: agent-postinstall
  annotations:
    "helm.sh/hook-weight": "3"
    "helm.sh/hook": "post-install"
spec:
  template:
    spec:
      containers:
      - name: agent-postinstall
        image: "{{ .Values.image.agent.repository }}:{{ .Values.image.agent.tag | default .Chart.AppVersion }}"
        imagePullPolicy: IfNotPresent
        command: ['python3', 'getBaseCfg.py']
        volumeMounts:
          - name: config-agent-volume
            mountPath: /etc/config
      volumes:
        - name: config-agent-volume
          configMap:
            name: agent-cm
      restartPolicy: Never
  backoffLimit: 1

The Python script needs to check that the telegraf/fluentd/agent processes are up before getting the config. I intend to wait (with a timeout) until pgrep <telegraf/fluentd/agent> succeeds and then call the APIs. Is there a way to enable shareProcessNamespace for the post-install hook as well? Thanks.

PS: Currently, the agent calls the Python script along with its own startup script. It works, but it is kludgy. I'd like to move it out of the agent container.


Solution

  • shareProcessNamespace

    The key point about this flag is that it works only within a single pod: all containers in that pod share one process namespace and can see each other's processes.

    The approach described uses a Job. A Job creates a separate pod, so setting the flag there won't help. To see the processes of the "main" pod, the container has to be part of that pod alongside the other containers.

    See the Kubernetes documentation on process namespace sharing for more details.
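
    To make the scoping concrete, here is a minimal sketch of where the flag lives. The pod name and images are placeholders, not taken from the question. The field sits at the pod spec level, so it can only ever cover the containers listed in that same pod; a hook Job's pod template could set it too, but that would share processes only among the Job's own containers.

    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: agent-pod                # hypothetical name
    spec:
      shareProcessNamespace: true    # pod-level field: applies to every container below
      containers:
      - name: telegraf
        image: telegraf              # placeholder image
      - name: fluentd
        image: fluent/fluentd        # placeholder image
      - name: agent
        image: example/agent         # hypothetical in-house image
    ```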

    A possible workaround

    It's possible to inspect the processes inside the containers directly with kubectl exec.

    Below is an example of checking the state of the processes with the pgrep command. The container named by pgrepContainer needs to have pgrep installed.

    job.yaml:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: "{{ .Release.Name }}-postinstall-hook"
      annotations:
        "helm.sh/hook": post-install
    spec:
      template:
        spec:
          serviceAccountName: config-user # service account with appropriate permissions is required using this approach
          volumes:
          - name: check-script
            configMap:
              name: check-script
          restartPolicy: Never
          containers:
          - name: post-install-job
            image: "bitnami/kubectl" # using this image with kubectl so we can connect to the cluster
            command: ["bash", "/mnt/script/checkScript.sh"]
            volumeMounts:
            - name: check-script
              mountPath: /mnt/script
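
    The config-user service account above needs permission to exec into pods. A minimal sketch of the RBAC objects, assuming namespace-scoped access is enough; the Role and RoleBinding name config-user-exec is made up for illustration. kubectl exec reads the pod first and then creates a pods/exec subresource, hence the two rules.

    ```yaml
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: config-user
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: config-user-exec         # hypothetical name
    rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get"]                 # kubectl exec looks up the pod first
    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create"]              # opening the exec session
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: config-user-exec         # hypothetical name
    subjects:
    - kind: ServiceAccount
      name: config-user
      namespace: "{{ .Release.Namespace }}"
    roleRef:
      kind: Role
      name: config-user-exec
      apiGroup: rbac.authorization.k8s.io
    ```

    If these are shipped as regular chart templates (not hooks), they are created before the post-install Job runs.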
    

    And configmap.yaml, which contains the script. It checks for the three processes in a loop of up to 60 attempts, 10 seconds apart:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: check-script
    data:
      checkScript.sh: |
        #!/bin/bash
        # Pod and container to exec into; the container must have pgrep installed.
        podName=test
        pgrepContainer=app-1
        # Processes that must all be running before configuration starts.
        process1=sleep
        process2=pause
        process3=postgres
        attempts=0

        until [ "$attempts" -eq 60 ]; do
          # The && chain exits 0 only if all three pgrep calls find their process.
          kubectl exec "${podName}" -c "${pgrepContainer}" -- pgrep "${process1}" >/dev/null 2>&1 \
          && kubectl exec "${podName}" -c "${pgrepContainer}" -- pgrep "${process2}" >/dev/null 2>&1 \
          && kubectl exec "${podName}" -c "${pgrepContainer}" -- pgrep "${process3}" >/dev/null 2>&1

          if [ $? -eq 0 ]; then
            break
          fi

          attempts=$((attempts + 1))
          sleep 10
          echo "Waiting for all containers to be ready...$((attempts * 10)) s"
        done

        # Reaching 60 attempts means the loop ended without a successful check.
        if [ "$attempts" -eq 60 ]; then
          echo "ERROR: Timeout"
          exit 1
        fi

        echo "All containers are ready !"
        echo "Configuring telegraf and fluentd services"
    

    The final result looks like this:

    $ kubectl get pods
    NAME                        READY   STATUS     RESTARTS  AGE
    test                        2/2     Running    0         20m
    test-postinstall-hook-dgrc9 0/1     Completed  0         20m
    
    $ kubectl logs test-postinstall-hook-dgrc9
    Waiting for all containers to be ready...10 s
    All containers are ready !
    Configuring telegraf and fluentd services
    

    The above is an alternative approach; you can use its logic as a base to reach your end goal.

    postStart

    A postStart hook could also hold this logic. It runs immediately after the container is created. Since the main application takes time to start and the script already waits for it, the following caveat is not an issue:

    there is no guarantee that the hook will execute before the container ENTRYPOINT
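
    A postStart hook along these lines might look like the sketch below. It reuses the image reference and the getBaseCfg.py command from the question; the pod name is a placeholder, and the script is assumed to do its own bounded waiting. Because the hook runs inside the main pod, shareProcessNamespace lets pgrep in the agent container see telegraf and fluentd.

    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: agent-pod                # hypothetical name
    spec:
      shareProcessNamespace: true
      containers:
      - name: agent
        image: "{{ .Values.image.agent.repository }}:{{ .Values.image.agent.tag | default .Chart.AppVersion }}"
        lifecycle:
          postStart:
            exec:
              # Assumes the script waits (with a timeout) for the processes itself.
              command: ["python3", "getBaseCfg.py"]
      # telegraf and fluentd containers omitted for brevity
    ```

    Two things to keep in mind with this variant: the container is not reported as Running until the handler returns, and a failing handler causes the container to be killed. The hook also fires on every container start, not once per release, so the script should be safe to run repeatedly.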