Tags: c, signals, process-group

pgid in signal handler is different from the real pgid


I have the following simple program that sets the main program's pgid and makes it the foreground process group for STDIN. I also have a signal handler that prints the pgid of the current process and the pgid of the process that sent the signal. Here is my code:

pid_t pid;

void handler(int signum, siginfo_t* siginfo, void* context){
    printf("pgid is %d, shell_pgid is %d \n", getpgid(siginfo->si_pid), pid);
}


int main()
{
    struct sigaction sa;


    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART; 

    sigaction(SIGINT, &sa, NULL);

    pid = getpid();
    setpgid(pid, pid);
    tcsetpgrp(STDIN_FILENO, pid);


      while(1){

      }
}

However, when I press ^C, the output I get is

^Cpgid is 335, shell_pgid is 3924 

Aren't they supposed to be the same since the program is running in the main program and the signal is also sent from the same source?


Solution

  • I think you may be a little confused about how process group IDs work.

    First, I tidied up your source:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <signal.h>
    #include <string.h>
    
    pid_t pid;
    
    void
    handler (int signum, siginfo_t * siginfo, void *context)
    {
      printf ("in signal handler pid is %d, getpgid(pid) is %d \n",
              pid, getpgid (pid));
      printf
        ("in signal handler siginfo->si_pid is %d, getpgid(siginfo->si_pid) is %d \n",
         siginfo->si_pid, getpgid (siginfo->si_pid));
      exit (0);
    }
    
    
    int
    main (int argc, char **argv)
    {
      struct sigaction sa;
    
      memset (&sa, 0, sizeof (sa));
      sa.sa_sigaction = handler;
      sigemptyset (&sa.sa_mask);
      sa.sa_flags = SA_RESTART | SA_SIGINFO;
    
      sigaction (SIGINT, &sa, NULL);
    
      pid = getpid ();
      printf ("before call pgid is %d pid=%d\n", getpgid (pid), pid);
      setpgid (pid, pid);
      printf ("after setpgid call pgid is %d pid=%d\n", getpgid (pid), pid);
      tcsetpgrp (STDIN_FILENO, pid);
      printf ("after tcsetprgrp call pgid is %d pid=%d\n", getpgid (pid), pid);
    
      while (1)
        {
        }
    }
    

    The main change is that if your handler takes three parameters, you need to set SA_SIGINFO in sa_flags and install the handler through sa_sigaction, not sa_handler. Without SA_SIGINFO the handler is only guaranteed to receive the signal number, so its second and third arguments may be invalid.

    Next I fixed your handler so it printed out si_pid as well as pid.

    I also put in some additional debugging.

    This is what happens when I run straight from the shell:

    $ ./x
    before call pgid is 15136 pid=15136
    after setpgid call pgid is 15136 pid=15136
    after tcsetprgrp call pgid is 15136 pid=15136
    ^Cin signal handler pid is 15136, getpgid(pid) is 15136
    in signal handler siginfo->si_pid is 0, getpgid(siginfo->si_pid) is 15136
    

    Note that siginfo->si_pid is reported as 0: si_pid is only filled in for signals sent by a process, for example through kill, whereas ^C is generated by the terminal driver. So 0 is passed to getpgid(), which returns the PGID of the calling process, and that is unsurprisingly the same value that getpgid(pid) returned on the previous line.
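
    A handler can check where a signal came from before trusting si_pid. The following is a minimal sketch, not part of the program above: it sends SIGUSR1 to itself with kill() and records si_code and si_pid. The names seen_code and seen_pid are made up for this illustration. For a signal sent with kill(), si_code should be SI_USER; a terminal-generated ^C should carry a different code, and then si_pid is not meaningful.

    ```c
    #define _POSIX_C_SOURCE 200809L
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Written by the handler, read by main (hypothetical names). */
    static volatile sig_atomic_t got_signal = 0;
    static volatile sig_atomic_t seen_code = 0;
    static volatile sig_atomic_t seen_pid = 0;

    static void handler(int signum, siginfo_t *info, void *context)
    {
        (void)signum;
        (void)context;
        /* Only record values here; printing is left to main. */
        seen_code = info->si_code;
        seen_pid = (sig_atomic_t)info->si_pid;
        got_signal = 1;
    }

    int main(void)
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = handler;      /* three-argument handler */
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = SA_SIGINFO;       /* required for sa_sigaction */
        if (sigaction(SIGUSR1, &sa, NULL) == -1) {
            perror("sigaction");
            return 1;
        }

        /* A signal sent to ourselves with kill() is delivered before
           kill() returns, because SIGUSR1 is not blocked here. */
        if (kill(getpid(), SIGUSR1) == -1) {
            perror("kill");
            return 1;
        }
        if (!got_signal) {
            fprintf(stderr, "signal was not delivered\n");
            return 1;
        }

        printf("sent by kill(): %s\n", seen_code == SI_USER ? "yes" : "no");
        printf("si_pid matches sender: %s\n",
               seen_pid == (sig_atomic_t)getpid() ? "yes" : "no");
        return 0;
    }
    ```

    In the original handler, a check such as si_code == SI_USER before calling getpgid(siginfo->si_pid) would avoid silently passing 0.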

    Here's what happens if I kill it with kill -SIGINT from another process rather than pressing ^C.

    $ ./x
    before call pgid is 15165 pid=15165
    after setpgid call pgid is 15165 pid=15165
    after tcsetprgrp call pgid is 15165 pid=15165
    in signal handler pid is 15165, getpgid(pid) is 15165
    in signal handler siginfo->si_pid is 14858, getpgid(siginfo->si_pid) is 14858
    

    As you can see, the last line now reports the PID of the process that sent the signal (14858), along with that process's PGID.

    In both examples above, the PGID is already equal to the PID when the process starts. Why is that? We launched a single command from the command line, so the shell put it in a process group of its own with that one process as the group leader, and a group leader's PGID is its own PID.
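
    What the shell does for a single-command job can be imitated roughly with fork() and setpgid(). This is only a sketch under that assumption, not how any particular shell is implemented: the child first checks that it inherited its parent's process group, then makes itself the leader of a new group and checks that its PGID now equals its PID.

    ```c
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t parent_pgid = getpgid(0);
        pid_t child = fork();

        if (child == -1) {
            perror("fork");
            return 1;
        }

        if (child == 0) {
            /* A forked child starts out in its parent's process group. */
            int inherited = (getpgid(0) == parent_pgid);

            /* Become the leader of a new group named after our own PID. */
            if (setpgid(0, 0) == -1)
                _exit(2);
            int leader = (getpgid(0) == getpid());

            _exit(inherited && leader ? 0 : 1);
        }

        int status;
        if (waitpid(child, &status, 0) == -1) {
            perror("waitpid");
            return 1;
        }

        int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
        printf("child inherited the group, then became its own leader: %s\n",
               ok ? "yes" : "no");
        return 0;
    }
    ```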

    So what happens if we launch a process group where we are not the first process? Try this:

    $ echo | ./x
    before call pgid is 15173 pid=15174
    after setpgid call pgid is 15174 pid=15174
    after tcsetprgrp call pgid is 15174 pid=15174
    in signal handler pid is 15174, getpgid(pid) is 15174
    in signal handler siginfo->si_pid is 14858, getpgid(siginfo->si_pid) is 14858
    

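    Note that I had to kill this one with kill -SIGINT, because ^C goes to the terminal's foreground process group, and after the PGID change that group contains only echo. (Here stdin is a pipe rather than a terminal, so the tcsetpgrp call cannot move the foreground group.) So the PGID on entry is 15173 (the PID of echo), and setpgid changes it to 15174, as you requested.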
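
    The program above never checks the return value of tcsetpgrp(), so a failure goes unnoticed. As a small sketch of why that matters in the pipeline case, the following creates a pipe to stand in for the stdin that ./x receives in `echo | ./x` and calls tcsetpgrp() on it. Since a pipe is not a terminal, the call is expected to fail with ENOTTY.

    ```c
    #define _POSIX_C_SOURCE 200809L
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];

        if (pipe(fds) == -1) {
            perror("pipe");
            return 1;
        }

        /* fds[0] plays the role of a piped stdin; it is not a terminal. */
        errno = 0;
        int rc = tcsetpgrp(fds[0], getpgrp());

        printf("tcsetpgrp on a pipe returned %d, ENOTTY: %s\n",
               rc, (rc == -1 && errno == ENOTTY) ? "yes" : "no");

        close(fds[0]);
        close(fds[1]);
        return 0;
    }
    ```

    Checking the results of setpgid() and tcsetpgrp() in the same way in the original program would show directly which of the calls took effect.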

    I think it's all working as expected.

    I think the problem is in essence in your signal handler. First, you seem to expect si_pid to be filled in. Second, your printf claims to print pgid and shell_pgid (two PGIDs), whereas it actually prints the PGID of the process that issued the kill (or, if there is none, the result of getpgid(0), which is the PGID of the calling process), followed by the PID of your process. So the values are the wrong way round, and one is a PGID while the other is a PID. I also suspect that setting up the handler incorrectly gives you a junk second parameter anyway.