I have a simple C program that behaves differently depending on whether or not it is run under gdb. The program is this:
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>   /* getpid() */

    int main(void) {
        /* Send SIGFPE to ourselves. */
        kill(getpid(), SIGFPE);
        printf("I'm happy.\n");
        return 0;
    }
When run by itself, I get this very strange result:
    ezyang@javelin:~$ ./mini
    I'm happy.
    ezyang@javelin:~$ echo $?
    0
No error! That is not to say that the signal is not being sent; it is:
    ezyang@javelin:~$ strace -e signal ./mini
    kill(31950, SIGFPE) = 0
    --- SIGFPE (Floating point exception) @ 0 (0) ---
    I'm happy
When in GDB, things proceed differently:
    ezyang@javelin:~/Dev/ghc-build-sandbox/libraries/unix/tests/libposix$ gdb ./mini
    GNU gdb (GDB) 7.5.91.20130417-cvs-ubuntu
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    For bug reporting instructions, please see:
    ...
    Reading symbols from /srv/code/ghc-build-sandbox/libraries/unix/tests/libposix/mini...(no debugging symbols found)...done.
    (gdb) r
    Starting program: /srv/code/ghc-build-sandbox/libraries/unix/tests/libposix/mini
    warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
    Program received signal SIGFPE, Arithmetic exception.
    0x00007ffff7a49317 in kill () at ../sysdeps/unix/syscall-template.S:81
    81 ../sysdeps/unix/syscall-template.S: No such file or directory.
    (gdb) c
    Continuing.
    Program terminated with signal SIGFPE, Arithmetic exception.
    The program no longer exists.
Asking GDB not to stop makes no difference:
    (gdb) handle SIGFPE nostop
    Signal   Stop   Print   Pass to program   Description
    SIGFPE   No     Yes     Yes               Arithmetic exception
    (gdb) r
    Starting program: /srv/code/ghc-build-sandbox/libraries/unix/tests/libposix/mini
    warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
    Program received signal SIGFPE, Arithmetic exception.
    Program terminated with signal SIGFPE, Arithmetic exception.
    The program no longer exists.
What's going on?! First, why isn't the SIGFPE killing the program? Second, why is GDB behaving differently?
Update. One thought is that the child process is inheriting the signal masks of the parent. However, the following transcript seemed to show that this is not the case. (This analysis was not correct; see Update 2.)
    ezyang@javelin:~$ trap - SIGFPE
    ezyang@javelin:~$ ./mini
    I'm happy.
Update 2. A friend of mine points out that trap only reports signal dispositions set by the shell itself, not those inherited from any parent process. So we tracked down the ignored signals of all the parents, and lo and behold, rxvt-unicode was ignoring SIGFPE. A friend confirmed he could reproduce the behavior when he ran the executable from rxvt-unicode.
Ignored signals are inherited across fork() and exec*():
    $ ./mini
    Floating point exception (core dumped)
    $ trap '' SIGFPE
    $ ./mini
    I'm happy.
    $ trap - SIGFPE
    $ ./mini
    Floating point exception (core dumped)
I discussed this privately with the question author. Debugging was complicated by the fact that bash saves and restores the signal mask from its parent process, and that the trap builtin only reports signals that were handled or ignored in the current shell, even though ignored signals inherited from the parent process will still take effect.
It turns out the root problem was that he was running the test inside urxvt, which links libperl, which unconditionally ignores SIGFPE.