
Creating a sibling process to a container using containerd


I have a Kubernetes cluster (Docker and containerd) where I deployed the Weave CNI plugin.

When inspecting the master node's processes (ps -aef --forest), I can see that the containerd-shim process that runs the Weave plugin has 3 processes in its tree:

31175  16241 \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/836489.. -address /run/containerd/contai
31199  31175 |   \_ /bin/sh /home/weave/launch.sh
31424  31199 |   |   \_ /home/weave/weaver --port=6783 --datapath=datapath --name=36:e4:33:8
31656  31175 |   \_ /home/weave/kube-utils -run-reclaim-daemon -node-name=ubuntu -peer-name=36:e4

What I fail to understand is how the kube-utils process (pid 31656), which is started from the launch.sh script process (pid 31199), is a sibling of it and not a child?

I tried to create a similar environment to emulate this scenario by building a Docker image from the following Dockerfile:

FROM ubuntu:18.04
ADD ./launch.sh /home/temp/
ENTRYPOINT ["/home/temp/launch.sh"]

Where launch.sh in my case is similar in spirit to Weave's:

#!/bin/sh

start() {
    sleep 2000&
}

start &

sleep 4000

After deploying this to the cluster I get the following process tree:

114944  16241 \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/d9a6904 -address /run/containerd/contai
114972 114944     \_ /bin/sh /home/temp/launch.sh
115002 114972         \_ sleep 4000
115003 114972         \_ sleep 2000

As you can see, both processes are children of the main container process, not siblings.

Based on the Weave scenario above, I would expect the sleep 2000 process to be a sibling of the launch.sh process, not a child.

Any idea how to explain the Weave situation above? How can I reproduce it locally? In what scenario does a process end up as a sibling of the main container process?

Thank you all.


Solution

  • Based on the Weave scenario above, I would expect the sleep 2000 process to be a sibling of the launch.sh process, not a child.

    I reproduced the setup you described and ran into the same situation (the backgrounded sleep ended up as a child of launch.sh, not a sibling). To get sibling processes you need the following parameters in your Deployment or Pod YAML: hostPID: true in the Pod spec and privileged: true in the container's securityContext, as shown in the fragment below.

    You can read more about hostPID and securityContext in the Kubernetes documentation.
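
    A minimal Pod fragment with just the two relevant fields (a sketch; the name is hypothetical, and the complete manifests appear later in this answer):

    apiVersion: v1
    kind: Pod
    metadata:
      name: example             # hypothetical name
    spec:
      hostPID: true             # share the host's PID namespace
      containers:
      - name: app
        image: ubuntu
        securityContext:
          privileged: true      # run privileged, as Weave does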


    This changes the process tree because of PID namespaces and reparenting. Without hostPID, the container gets its own PID namespace in which launch.sh is PID 1, so orphaned processes are reparented to launch.sh. With hostPID: true, the container shares the host's PID namespace, and orphaned processes are instead reparented to the nearest child subreaper, which is the containerd-shim (the shim marks itself as a subreaper via prctl(PR_SET_CHILD_SUBREAPER)). That is how a process backgrounded from a subshell that has already exited ends up as a sibling of launch.sh.

    It works this way with Weave because its DaemonSet manifest sets the parameters mentioned above (hostPID: true and a privileged securityContext); you can look them up in the weave-kube manifests in the weaveworks/weave repository. Note that these processes also run as root.
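
    To answer the "how can I reproduce this locally?" part: the same reparenting can be observed with plain Docker by sharing the host PID namespace. A sketch, assuming the image from your Dockerfile is built with the hypothetical tag sibling-test:

    # Build the image from the question's Dockerfile.
    docker build -t sibling-test .

    # Run it in the host PID namespace; --privileged mirrors the
    # securityContext used in the Pod manifests below.
    docker run -d --pid=host --privileged sibling-test

    # On the host, sleep 2000 should now appear under containerd-shim
    # as a sibling of launch.sh, not as its child.
    ps -aef --forest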


    Example

    This is an example that shows a setup where some of the sleep commands end up as siblings of launch.sh. Where each process lands in the tree depends on how it is started:

    launch.sh file:

    #!/bin/bash
    start() {
        sleep 10030 &             # parent subshell exits, so this gets reparented
    }
    start &                       # run the function in a backgrounded subshell
    ( sleep 10040 &)              # backgrounded inside a subshell that exits at once: reparented
    sleep 10050 &                 # backgrounded directly: remains a child of launch.sh
    /bin/sh -c 'sleep 10060'      # foreground: remains a child, via an intermediate /bin/sh
    

    Using ConfigMap with a script as an entrypoint

    You can use the above script to create a ConfigMap that will then be used to run a Pod (the manifest below expects the ConfigMap to be named entrypoint):
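
    Assuming the script is saved locally as launch.sh, it can be created like this:

    kubectl create configmap entrypoint --from-file=launch.sh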

    Pod YAML definition:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        run: bashtest
      name: bashtest
    spec:
      containers:
      - image: ubuntu
        name: bashtest
        command: ["/mnt/launch.sh"]
        resources: {}
        securityContext:
           privileged: true
        volumeMounts:
        - mountPath: /mnt/launch.sh
          name: ep
          subPath: launch.sh
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      hostPID: true
      volumes:
        - name: ep
          configMap:
            defaultMode: 0750
            items:
            - key: launch.sh
              path: launch.sh
            name: entrypoint
    

    Building an image with all the files included

    You can also build the script into an image. Please remember that this image is for example purposes only (build-and-push commands are sketched after the Dockerfile).

    Dockerfile:

    FROM ubuntu:18.04
    ADD ./launch.sh /
    RUN chmod 777 ./launch.sh
    ENTRYPOINT ["/launch.sh"]
    
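    A sketch of building and pushing the image; gcr.io/dkruk-test-00 in the manifest below is my test project, so substitute a registry and project of your own:

    docker build -t gcr.io/<your-project>/bashtest .
    docker push gcr.io/<your-project>/bashtest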

    Pod YAML definition:

    apiVersion: v1
    kind: Pod
    metadata:
      name: process
      labels:
        app: ubuntu
    spec:
      containers:
      - image: gcr.io/dkruk-test-00/bashtest
        imagePullPolicy: Always
        name: ubuntu
        securityContext:
           privileged: true
      hostPID: true
      restartPolicy: Always
    

    After applying the manifest for either of these setups (the built image or the ConfigMap variant), e.g. with kubectl apply -f pod.yaml, you should be able to run ps -aef --forest on the node that is running the Pod and see output similar to this (truncated):

    root     2297272     290  0 09:44 ?        00:00:00  \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/5c802039033683464d5a586
    root     2297289 2297272  0 09:44 ?        00:00:00      \_ /bin/bash /launch.sh
    root     2297306 2297289  0 09:44 ?        00:00:00      |   \_ sleep 10050
    root     2297307 2297289  0 09:44 ?        00:00:00      |   \_ /bin/sh -c sleep 10060
    root     2297310 2297307  0 09:44 ?        00:00:00      |       \_ sleep 10060
    root     2297305 2297272  0 09:44 ?        00:00:00      \_ sleep 10040
    root     2297308 2297272  0 09:44 ?        00:00:00      \_ sleep 10030