bash docker containers zombie-process

How to reap zombie processes in a Docker container with bash


Recently I have been studying dumb-init, and if I understand correctly it tries to:

  1. run as PID 1, acting like a simple init system (reaping zombie processes)
  2. proxy/forward signals (which bash doesn't do)
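
To illustrate point 2: bash does not forward signals to a child it is waiting on, but a wrapper script can relay them explicitly with trap. The sketch below is only an approximation of the idea, not how dumb-init is implemented, and the helper name run_with_forwarding is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: run a command in the background and relay
# SIGTERM/SIGINT to it as SIGTERM, then report the child's exit status.
run_with_forwarding() {
    "$@" &
    local child=$! status
    # Relay termination requests to the child.
    trap 'kill -TERM "$child" 2>/dev/null' TERM INT
    wait "$child"
    status=$?
    # A trapped signal makes `wait` return early, so keep waiting
    # until the child has really gone away.
    while kill -0 "$child" 2>/dev/null; do
        wait "$child"
        status=$?
    done
    trap - TERM INT
    return "$status"
}

# Demo: the wrapper hands back the child's own exit status.
run_with_forwarding sh -c 'exit 3' || echo "child exited with status $?"
```

The child has to run in the background here because bash only runs trap handlers once the current foreground command has finished; the wait builtin, by contrast, is interrupted when a trapped signal arrives.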

Both here and here it is mentioned that bash is capable of reaping zombie processes, so I tried to verify this but couldn't make it work.

First of all, I wrote a simple Go program which spawns 10 zombie processes:

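package main

import (
    "fmt"
    "os"
    "os/exec"
    "os/signal"
    "syscall"
)

func main() {
    sigs := make(chan os.Signal, 1)

    // SIGKILL cannot be caught, so only SIGINT and SIGTERM are registered.
    signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)

    go func() {
        for i := 0; i < 10; i++ {
            // Start without a matching Wait: each exited sleep stays a zombie.
            sleepCmd := exec.Command("sleep", "1")
            _ = sleepCmd.Start()
        }
    }()

    fmt.Println("awaiting signal")
    sig := <-sigs
    fmt.Println()
    fmt.Printf("received %s, exiting\n", sig.String())
}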

Then I built an image for it:

FROM golang:1.15-alpine3.12 as builder

WORKDIR /

COPY . .

RUN go build -o main main.go

FROM alpine:3.12

RUN apk --no-cache --update add dumb-init bash

WORKDIR /
COPY --from=builder /main /
COPY --from=builder /entrypoint.sh /
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/main"]

If I run docker run -d <image>, it works as expected: I can see 10 zombie processes in ps:

vagrant@vagrant:/vagrant/dumb-init$ ps aux | grep sleep
root      4388  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4389  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4390  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4391  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4392  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4393  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4394  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4395  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4396  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>
root      4397  0.0  0.0      0     0 ?        Z    13:54   0:00 [sleep] <defunct>

The second step is to verify that bash is actually capable of reaping processes, so I changed my Docker image's ENTRYPOINT to entrypoint.sh, which just wraps my program with bash:

#!/bin/bash

/clever

If I run ps in the container, the zombie processes are still hanging there:

/ # ps
PID   USER     TIME  COMMAND
    1 root      0:00 {entrypoint.sh} /bin/bash /entrypoint.sh
    7 root      0:00 /clever
   13 root      0:00 [sleep]
   14 root      0:00 [sleep]
   15 root      0:00 [sleep]
   16 root      0:00 [sleep]
   17 root      0:00 [sleep]
   18 root      0:00 [sleep]
   19 root      0:00 [sleep]
   20 root      0:00 [sleep]
   21 root      0:00 [sleep]
   22 root      0:00 [sleep]
   31 root      0:00 /bin/sh
   39 root      0:00 ps

I tried a few other ways but still couldn't figure out how to reap the zombie processes correctly.

Thanks for the help.


Solution

  • I wrote a small demo in C that helps to demonstrate that bash reaps the zombie processes, and what it looks like when they are not reaped.

    First, the definition of a zombie process: a zombie is a process that has finished its work and produced an exit code. The kernel keeps its process-table entry until the parent collects the exit code.

    For a zombie to linger, the parent must not collect the child's exit status, i.e. it never calls wait (for example from a SIGCHLD handler).
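
As a small illustration of "collecting the exit code" from the shell side, the sketch below starts a child that exits with status 10 (the same code the grandchild in the C demo uses) and retrieves that status with the wait builtin. The function name run_and_reap is made up for this example. Bash calls waitpid for its own children internally; the builtin is how a script gets at the collected status.

```shell
#!/bin/bash
# Hypothetical helper: start a child that exits with the given status,
# then collect that status the way a well-behaved parent would.
run_and_reap() {
    ( exit "$1" ) &
    local child=$!
    # `wait` hands back the child's exit status; a parent that never
    # does the equivalent of this leaves the child as a zombie.
    wait "$child"
    echo "child exited with status $?"
}

run_and_reap 10
```

Running it prints "child exited with status 10", and no defunct entry is left behind.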

    Reaping the zombies

    The following C code creates two zombie processes: one belonging to the main process, and one belonging to the first child.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>
    
    int main(void)
    {
        printf("Starting Program!\n");
    
        pid_t pid = fork();
        if (pid == 0)
        {
            pid = fork(); // Create a child zombie
            if (pid == 0) {
                // Grandchild: exits at once; its parent never waits for it.
                printf("Zombie process %i of the child process\n", getpid());
                exit(10);
            } else {
                printf("Child process %i is running!\n", getpid());
                sleep(10);  // wait 10s
                printf("Child process %i is exiting!\n", getpid());
                exit(0);
            }
        }
        else if (pid > 0)
        {
            pid = fork();
            if (pid == 0) {
                // Second child: falls through to exit(-1) below and becomes a zombie.
                printf("Zombie process %i from the parent process\n", getpid());
            } else {
                printf("Parent process %i...\n", getpid());
                sleep(5);
                printf("Parent process will crash with segmentation failt!\n");
                // Only the pointer is assigned, never dereferenced, so nothing
                // crashes here; the process ends through exit(-1) below.
                int* p = 0;
                p = (int*)10;
                (void)p;
            }
        }
        else perror("fork()");
        exit(-1);
    }
    

    I also created a Docker container that compiles the files for the demo. The whole project is available in the following git repository.

    After running the build and the demo, the following printout is shown in the console:

    root@d2d87f4aafbc:/zombie# ./zombie & ps -eaf --forest
    [1] 8
    Starting Program!
    Parent process 8...
    Zombie process 11 from the parent process
    Child process 10 is running!
    Zombie process 12 of the child process
    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0  0 10:43 pts/0    00:00:00 /bin/bash
    root           8       1  0 10:43 pts/0    00:00:00 ./zombie
    root          10       8  0 10:43 pts/0    00:00:00  \_ ./zombie
    root          12      10  0 10:43 pts/0    00:00:00  |   \_ [zombie] <defunct>
    root          11       8  0 10:43 pts/0    00:00:00  \_ [zombie] <defunct>
    root           9       1  0 10:43 pts/0    00:00:00 ps -eaf --forest
    root@d2d87f4aafbc:/zombie# Parent process will crash with segmentation failt!
    ps -eaf --forest
    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0  0 10:43 pts/0    00:00:00 /bin/bash
    root          10       1  0 10:43 pts/0    00:00:00 ./zombie
    root          12      10  0 10:43 pts/0    00:00:00  \_ [zombie] <defunct>
    root          13       1  0 10:43 pts/0    00:00:00 ps -eaf --forest
    [1]+  Exit 255                ./zombie
    root@d2d87f4aafbc:/zombie# Child process 10 is exiting!
    ps -eaf --forest
    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0  0 10:43 pts/0    00:00:00 /bin/bash
    root          14       1  0 10:43 pts/0    00:00:00 ps -eaf --forest
    

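    The main process (PID 8) creates two children (PIDs 10 and 11), and PID 10 creates a child of its own (PID 12).

    After creating the processes, the parent sleeps for 5s and then terminates, leaving the zombies behind. The code announces a segmentation fault, but the transcript shows the process ending with exit status 255, i.e. through exit(-1).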

    When the main process dies, PID 11 is inherited by bash and cleaned up (reaped). PID 10 is still working (sleeping is a kind of work for a process), so bash leaves it alone; since PID 10 has not invoked wait, PID 12 is still a zombie.

    After 5 more seconds, PID 10 finishes sleeping and exits. Bash reaps PID 10 and inherits PID 12, which it then reaps as well.

    Leaving zombies

    The other C application just runs bash as a child process and stays PID 1 itself; it ignores the zombies.

    # docker run -ti --rm test /zombie/ignore
    root@b9d49363cb57:/zombie# ./zombie & ps -eaf --forest
    [1] 10
    Starting Program!
    Parent process 10...
    Zombie process 13 from the parent process
    Child process 12 is running!
    Zombie process 14 of the child process
    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0  0 11:18 pts/0    00:00:00 /zombie/ignore
    root           7       1  0 11:18 pts/0    00:00:00 sh -c /bin/bash
    root           8       7  0 11:18 pts/0    00:00:00  \_ /bin/bash
    root          10       8  0 11:18 pts/0    00:00:00      \_ ./zombie
    root          12      10  0 11:18 pts/0    00:00:00      |   \_ ./zombie
    root          14      12  0 11:18 pts/0    00:00:00      |   |   \_ [zombie] <defunct>
    root          13      10  0 11:18 pts/0    00:00:00      |   \_ [zombie] <defunct>
    root          11       8  0 11:18 pts/0    00:00:00      \_ ps -eaf --forest
    root@b9d49363cb57:/zombie# pParent process will crash with segmentation failt!
    ps -eaf --forest
    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0  0 11:18 pts/0    00:00:00 /zombie/ignore
    root           7       1  0 11:18 pts/0    00:00:00 sh -c /bin/bash
    root           8       7  0 11:18 pts/0    00:00:00  \_ /bin/bash
    root          15       8  0 11:18 pts/0    00:00:00      \_ ps -eaf --forest
    root          12       1  0 11:18 pts/0    00:00:00 ./zombie
    root          14      12  0 11:18 pts/0    00:00:00  \_ [zombie] <defunct>
    root          13       1  0 11:18 pts/0    00:00:00 [zombie] <defunct>
    [1]+  Exit 255                ./zombie
    root@b9d49363cb57:/zombie# Child process 12 is exiting!
    ps -eaf --forest
    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0  0 11:18 pts/0    00:00:00 /zombie/ignore
    root           7       1  0 11:18 pts/0    00:00:00 sh -c /bin/bash
    root           8       7  0 11:18 pts/0    00:00:00  \_ /bin/bash
    root          16       8  0 11:18 pts/0    00:00:00      \_ ps -eaf --forest
    root          12       1  0 11:18 pts/0    00:00:00 [zombie] <defunct>
    root          13       1  0 11:18 pts/0    00:00:00 [zombie] <defunct>
    root          14       1  0 11:18 pts/0    00:00:00 [zombie] <defunct>
    root@b9d49363cb57:/zombie#
    

    So now we have 3 zombies left hanging in the system.
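
To check for leftover zombies without reading ps output by eye, something like the following could be run inside the container. It is a Linux-only sketch that reads /proc/<pid>/status; the function name list_zombies is made up for this example. Printing the parent PID next to each zombie shows at a glance whether a zombie hangs under PID 1 or under a process that is still alive and simply has not waited for it.

```shell
#!/bin/bash
# Hypothetical helper: print "<pid> <ppid>" for every process in state Z.
list_zombies() {
    local status_file
    for status_file in /proc/[0-9]*/status; do
        # Processes can vanish mid-scan, so read errors are silenced.
        awk '/^State:/ { state = $2 }
             /^Pid:/   { pid = $2 }
             /^PPid:/  { ppid = $2 }
             END       { if (state == "Z") print pid, ppid }' \
            "$status_file" 2>/dev/null
    done
}

list_zombies
```

A zombie whose parent PID is 1 is waiting on whatever runs as PID 1 in the container; a zombie with any other parent PID can only be reaped once that parent calls wait or exits.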