c linux shell unix job-control

Background and suspended processes - Implementing a Job Control Shell in C


I'm implementing a job control shell in C on Linux as a project for an Operating Systems course. My main() function manages child processes with the help of a linked list, shown below, which stores information about background and suspended jobs:

typedef struct job_
{
    pid_t pgid; /* group id = process leader id */
    char * command; /* program name */
    enum job_state state;
    struct job_ *next; /* next job in the list */
} job;

Every time a child process exits or is stopped, SIGCHLD is sent to the parent process to inform it. My signal handler, shown below, walks the job list and, for each node, checks whether the process it represents has exited; if it has, the node is removed from the list. Here is the code for the SIGCHLD handler, where 'job_list' is the linked list that stores this information:

void mySIGCHLD_Handler(int signum) {
    block_SIGCHLD();
    if (signum == 17) {
        job *current_node = job_list->next, *node_to_delete = NULL;
        int process_status, process_id_deleted;

        while (current_node) {

            /* Wait for a child process to finish.
            *    - WNOHANG: return immediately if the process has not exited
            */
            waitpid(current_node->pgid, &process_status, WNOHANG);

            if (WIFEXITED(process_status) != 0) {
                node_to_delete = current_node;
                current_node = current_node->next;
                process_id_deleted = node_to_delete->pgid;
                if (delete_job(job_list, node_to_delete)) {
                    printf("Process #%d deleted from job list\n", process_id_deleted);
                } else {
                    printf("Process #%d could not be deleted from job list\n", process_id_deleted);
                }
            } else {
                current_node = current_node->next;
            }
        }
    }
    unblock_SIGCHLD();
}

The problem is that when the handler is called, some entries are deleted even though the processes they represent have not exited. Does anyone know why this happens?

Thank you, and sorry for taking up your time :(


Solution

  • I see many problems in this code, but the immediate issue is probably here:

            waitpid(current_node->pgid, &process_status, WNOHANG);
            if (WIFEXITED(process_status) != 0) {
    

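    When waitpid(pid, &status, WNOHANG) returns 0 because the process has not changed state, it does not write anything to status, so the subsequent if is branching on garbage: process_status is either uninitialized or still holds the value from a previous loop iteration, which is why jobs that are still running get deleted. You need to check the actual return value of waitpid before assuming status is meaningful.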
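
    As a rough illustration of that check, here is a minimal sketch. The names poll_child, demo_poll_child and enum poll_result are hypothetical helpers invented for this example, not part of the asker's code, and adding WUNTRACED so that stopped children are also reported is an assumption about what a job control shell would want. The status is only inspected when waitpid returns a positive pid:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result type, introduced only for this sketch. */
enum poll_result {
    POLL_RUNNING,   /* waitpid returned 0: no state change, status untouched */
    POLL_EXITED,    /* child terminated normally */
    POLL_SIGNALED,  /* child was killed by a signal */
    POLL_STOPPED,   /* child was stopped (reported because of WUNTRACED) */
    POLL_ERROR      /* waitpid returned -1, e.g. ECHILD */
};

/* Non-blocking check of one child. *status_out is written only when
 * waitpid actually reported a state change (return value > 0). */
enum poll_result poll_child(pid_t pid, int *status_out)
{
    int status = 0;
    pid_t r = waitpid(pid, &status, WNOHANG | WUNTRACED);

    if (r == 0)
        return POLL_RUNNING;        /* do NOT look at status here */
    if (r == -1)
        return POLL_ERROR;

    *status_out = status;
    if (WIFEXITED(status))
        return POLL_EXITED;
    if (WIFSIGNALED(status))
        return POLL_SIGNALED;
    if (WIFSTOPPED(status))
        return POLL_STOPPED;
    return POLL_ERROR;
}

/* Usage sketch: fork a child that exits with code 7 and poll until
 * it has been reaped. */
enum poll_result demo_poll_child(int *exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return POLL_ERROR;
    if (pid == 0)
        _exit(7);                   /* child: terminate immediately */

    struct timespec pause_1ms = { 0, 1000000L };
    enum poll_result r;
    int status = 0;
    while ((r = poll_child(pid, &status)) == POLL_RUNNING)
        nanosleep(&pause_1ms, NULL);

    if (r == POLL_EXITED)
        *exit_code = WEXITSTATUS(status);
    return r;
}
```

    In the handler's loop, a node would then be removed only for POLL_EXITED or POLL_SIGNALED, while POLL_RUNNING leaves it alone.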

    The most important other problems are: