c linux shell sigchld

Unable to getpgid() for a pid inside sigchld_handler


In the shell I am developing, I execute a pipeline of commands A | B | C by forking one child for each command in the pipeline. All 3 children share the PGID of the first child; that is, the children with PIDs x, y, z have PGID = x. All 3 commands run perfectly. In the SIGCHLD signal handler sigchld_handler(), I count the number of terminated children, and once it reaches 3, I look up the PGID so I can find the job's data and remove it from the JobList. However, getpgid() returns -1 for all 3 PIDs: getpgid(x), getpgid(y), and getpgid(z) all return -1 with errno 3 (ESRCH).

When I set the children's PGID with setpgid() in the parent process, getpgid() worked fine and returned x. The problem occurs only in the signal handler. How can I get the PGID of a PID inside the signal handler?

Here is the signal handler code:

void sigchld_handler(int s) {

    // declarations
    pid_t pid, pgid;
    .
    .
    .

    while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
        pgid = getpgid(pid);   // pgid = -1, but should return x.
        .
        .
        .
    }
}

Whereas in main(), in the parent process, after I do:

.
.
setpgid(x, x);
setpgid(y, x);
setpgid(z, x);
.
.


getpgid(x) returns x
getpgid(y) returns x
getpgid(z) returns x

Any help is greatly appreciated.

Thanks.


Solution

  • SIGCHLD is the signal a parent receives when a child process terminates, so it makes sense that you can't request the PGID of a dead process. More precisely, you call getpgid() only after waitpid() has already reaped the child, so the system can no longer find the requested PID and has nothing to extract a PGID from.

    The error you get (3) is ESRCH:

    #define ESRCH        3  /* No such process */
    

    This confirms the point: the PID is no longer valid. I recommend keeping an internal mapping from PID to PGID in your shell and looking the PGID up there instead of asking the kernel.
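The internal mapping suggested above could look roughly like the following. This is only a minimal sketch, not the asker's actual JobList code: the names pid_table, pid_table_add, pid_table_take and the MAX_TRACKED limit are hypothetical, and a fixed-size array is assumed for simplicity.

```c
#include <stddef.h>
#include <sys/types.h>

#define MAX_TRACKED 64            /* hypothetical cap on tracked children */

struct pid_entry {
    pid_t pid;                    /* child PID; 0 marks a free slot */
    pid_t pgid;                   /* PGID recorded when the child was forked */
};

static struct pid_entry pid_table[MAX_TRACKED];

/* Remember that `pid` belongs to process group `pgid`.
 * Intended to be called in the parent right after setpgid(pid, pgid).
 * Returns 0 on success, -1 if the table is full. */
static int pid_table_add(pid_t pid, pid_t pgid)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (pid_table[i].pid == 0) {
            pid_table[i].pid = pid;
            pid_table[i].pgid = pgid;
            return 0;
        }
    }
    return -1;
}

/* Look up the recorded PGID for `pid` and free its slot.
 * Intended for use after waitpid() has reaped `pid`, in place of getpgid(pid).
 * Returns the PGID, or -1 if `pid` was never recorded. */
static pid_t pid_table_take(pid_t pid)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (pid_table[i].pid == pid) {
            pid_t pgid = pid_table[i].pgid;
            pid_table[i].pid = 0;
            return pgid;
        }
    }
    return -1;
}
```

In the handler, pgid = getpgid(pid) would then become pgid = pid_table_take(pid), ideally only when WIFEXITED(status) or WIFSIGNALED(status) is true, since a child reported as stopped via WUNTRACED is still alive. Because the handler can interrupt the parent in the middle of pid_table_add(), it is probably safest to block SIGCHLD with sigprocmask() around the fork/setpgid/add sequence and unblock it afterwards.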