c multiprocessing fork zombie-process waitpid

Why is the child process a zombie after I kill() it?


I have a multi-process program. To illustrate the problem briefly: the child process just blocks, and the parent process checks whether the child still exists; if it does, the parent kills it.

My code is below:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <time.h>
#include <errno.h> 
#include <sys/socket.h> 
#include <string.h>

#define TIME_OUT 3 

int get_current_time()
{
    struct timespec t;
    clock_gettime(CLOCK_REALTIME, &t);
    return t.tv_sec;
}

void child_process_exec() 
{
    int fd = open("./myfifo", O_WRONLY); // myfifo is a named pipe, here will be blocked.
    sleep(10);
}

void parent_process_exec(pid_t workProcessId)
{
    int status;
    int childRes; 
    int lastHeartBeatTime = get_current_time();
    while(1) {
        sleep(1);
        if (get_current_time() - lastHeartBeatTime> TIME_OUT) {
            childRes = waitpid(workProcessId, &status, WNOHANG);
            if(childRes == 0) {  
                printf("kill process\n"); 
                printf("kill get %d\n", kill(workProcessId, SIGTERM));
            }
            
            workProcessId = fork();
            if(workProcessId > 0) {
                lastHeartBeatTime = get_current_time();
            } else {
                printf("start up child process again\n");
                child_process_exec();
                return;
            }
        }
    }
}

int main()
{
    pid_t workProcessId = fork();
    if (workProcessId > 0) {
        parent_process_exec(workProcessId);
    } else {
        child_process_exec();
    }
    
    return 0;
}

But ps shows the child process as <defunct> in the terminal. Why is the child process a zombie after I kill() it? How can I kill the child process cleanly?


Solution

    1. At t+3s you call waitpid(..., WNOHANG), which returns immediately without reaping the child, as is evident from childRes == 0. You kill the first child, then overwrite workProcessId with the pid of the 2nd child. Rinse and repeat. This means waitpid() is never called after a child has terminated, and at t=T you end up with about T/3 zombie child processes. The easiest fix would be to pass 0 instead of WNOHANG once the child has been killed, so the parent blocks until the child is reaped. You would get a similar effect by just using wait() to block waiting for any child.

      Alternatively, maintain an array of pid_t holding each child that hasn't been reaped yet, then loop over that array with waitpid(..., WNOHANG).

    2. You probably want to fix the logic in parent_process_exec() so it doesn't unconditionally fork a new child.

    3. On Linux, I had to include signal.h for kill().

    4. Change int workProcessId to pid_t workProcessId.

    5. The 2nd argument to open() is an int not a string so you want to use O_WRONLY not "O_WRONLY". Always check return values.
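
As a rough sketch of point 1, the two helpers below show both options: killing a child and then reaping it with a blocking waitpid(), and polling an array of outstanding children with WNOHANG. The names kill_and_reap() and reap_finished() are made up for illustration and are not part of the question's code; the sketch assumes the child does not catch or ignore SIGTERM, since otherwise the blocking wait would hang.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: send SIGTERM to `pid`, then block until it is reaped.
 * Returns 0 on success and stores the wait status in *status; -1 on error. */
int kill_and_reap(pid_t pid, int *status)
{
    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* Blocking wait (options == 0); retry if a signal interrupts the call. */
    pid_t r;
    do {
        r = waitpid(pid, status, 0);
    } while (r == -1 && errno == EINTR);

    return r == pid ? 0 : -1;
}

/* Hypothetical helper for the array approach: poll every tracked child with
 * WNOHANG and compact the array so it only holds children that are still
 * running. Returns the number of entries kept. */
size_t reap_finished(pid_t *pids, size_t count)
{
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        int status;
        pid_t r = waitpid(pids[i], &status, WNOHANG);
        if (r == 0)
            pids[kept++] = pids[i];  /* still running, keep tracking it */
        /* r == pids[i]: reaped now; r == -1: no such child; drop both. */
    }
    return kept;
}
```

In the question's loop, kill_and_reap(workProcessId, &status) would take the place of the bare kill() call, before the replacement child is forked. With the array variant, the parent would append each killed pid to the array and call reap_finished() on every pass through the loop.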