linux linux-kernel signals system-calls seccomp

How can sigreturn block all signals except SIGKILL and SIGSTOP in SECCOMP_SET_MODE_STRICT?


In section SECCOMP_SET_MODE_STRICT of man 2 seccomp, it is said that:

Note that although the calling thread can no longer call sigprocmask(2), it can use sigreturn(2) to block all signals apart from SIGKILL and SIGSTOP.

I cannot figure out how to do this. According to man 2 sigreturn, the syscall works as follows:

This sigreturn() call undoes everything that was done—changing the process's signal mask, switching signal stacks (see sigaltstack(2))—in order to invoke the signal handler.

More specifically:

Using the information that was earlier saved on the user-space stack
sigreturn() restores the process's signal mask, switches stacks, and restores the process's context (processor flags and registers, including the stack pointer and instruction pointer),

The information is stored by:

The saved process context information is placed in a ucontext_t structure (see <sys/ucontext.h>). That structure is visible within the signal handler as the third argument of a handler established via sigaction(2) with the SA_SIGINFO flag.

I believe this is not possible, for the following two reasons:

  1. Since the Term action for a signal never returns to user space, there is no way to prevent the process from dying by using atexit or anything like that.

  2. Although it is possible to fill out a ucontext_t with getcontext or makecontext (man 3 makecontext), that won't help the process block the signal, since all the system calls for installing handlers and masking signals are disabled (unless sigreturn does the signal masking itself).


Solution

  • Yes, sigreturn() indeed causes the kernel to change the signal mask for the calling thread.

    Here is how and why:

    1. The process constructs a stack frame that looks exactly like the one the kernel builds when it delivers a signal, in the state it has when the signal handler has just returned.

      This stack frame contains the original signal mask for the thread used to deliver the signal, and the address where that thread was interrupted by the signal delivery. (In normal operation, the currently active signal mask has additional signals blocked; including the signal being delivered.)

      The process sets that signal mask to one it would prefer.
       

    2. The process calls sigreturn().

      The kernel examines the stack frame, notices the old signal mask, and reinstates that. It also cleans up the stack frame, and returns control back to the userspace code. (The stack frame contained the address of the next instruction to be executed.)
       

    3. That thread continues execution at the address specified in that stack frame, now with its preferred signal mask installed.
       

    Strict mode cannot block sigreturn(), because signal handlers need it in order to return normally. (However, note that a sigreturn() call is not required after a signal is delivered; within certain limitations, the userspace process can continue execution via siglongjmp() instead. siglongjmp() is not a syscall, just a userspace function. This means that the kernel is limited to this sort of behaviour, unless it deviates from POSIX.1 behaviour.)

    The kernel does not[1] differentiate between a context created by the userspace process itself and the context created by the kernel in the process stack. Because the context contains the signal mask reinstated by the kernel as part of the sigreturn() processing, sigreturn() will allow a thread to modify its signal mask.

    Note that SIGKILL and SIGSTOP are not affected, because they are the two signals the kernel enforces. All other signals should be considered as requests and notifications only, with the recipient always being able to block or ignore them.

    [1] Unless signal cookies or a similar sigreturn stack frame verification method is used.