Process stacks and interrupts on Cortex-M ARM cores


According to the ARMv7-M and ARMv8-M reference manuals, the exception stack frame is written to the currently active stack (MSP or PSP, depending on what the exception interrupted).

This decision looks illogical to me: every process stack must reserve space for an exception stack frame, which can be large, especially when the FPU and the Security Extension are in use. More importantly, it leaves at least one question unanswered: how do you isolate process stack overflows from the rest of the system?
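To put a number on "large", here is a rough sketch of how the frame grows. The word counts are my reading of the ARMv8-M exception entry description (basic frame, FP extension, additional state context, additional FP context) and should be checked against the manual for your core; `frame_bytes` and its parameter names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sizes in bytes. These are assumptions taken from my reading of the
// ARMv8-M manual, not values measured on hardware.
constexpr std::uint32_t kBasicFrame        = 8 * 4;   // R0-R3, R12, LR, return address, RETPSR
constexpr std::uint32_t kFpExtension       = 18 * 4;  // S0-S15, FPSCR, one reserved word
constexpr std::uint32_t kAdditionalState   = 10 * 4;  // integrity signature, reserved word, R4-R11
constexpr std::uint32_t kAdditionalFpState = 16 * 4;  // S16-S31

// Hypothetical helper: worst-case bytes pushed on exception entry.
constexpr std::uint32_t frame_bytes(bool fp_active,
                                    bool secure_state_stacked,
                                    bool fp_treated_as_secure) {
    std::uint32_t size = kBasicFrame;
    if (secure_state_stacked) {
        size += kAdditionalState;
    }
    if (fp_active) {
        size += kFpExtension;
        if (fp_treated_as_secure) {
            size += kAdditionalFpState;
        }
    }
    return size;
}

static_assert(frame_bytes(false, false, false) == 32, "basic frame");
static_assert(frame_bytes(true, true, true) == 208, "everything enabled");
```

Under these assumptions every process stack needs between 32 and 208 bytes of headroom just for exception entry, before the process itself uses a single byte.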

Suppose you have an ARMv8-M platform (e.g. Cortex-M33) running an unprivileged process with MPU restrictions enforced. The process has a single MPU region for its stack, and the PSPLIM register is also set. The process is running near its stack limit, and the remaining stack space is insufficient to hold an exception frame.

Now a peripheral interrupt arrives. Most likely you will get a UsageFault with the STKOF flag set. This is where the problems start. First, you missed the original exception. Most likely it is still pending and will be taken again. But how do you recover?

UsageFault handling is subject to the same stack limits, so there is still no space for an exception frame. HardFault can ignore stack limits, but that does not make the situation any better: an ignored stack limit means that memory beyond the stack is now corrupted. You could probably reserve some space below PSPLIM specifically for the HardFault frame, and then at least memory would not be corrupted.

Is there a safe way to deal with such a situation? The system should remain consistent and operational regardless of bugs (or malicious behavior) in an unprivileged process.


Solution

  • TL;DR

    The stack frame is not written. You lose the context of the currently executing task. Inaccessible memory is not corrupted. A UsageFault (for a stack limit violation) or a MemManage fault (for an MPU violation) is taken instead of the original exception. This behavior is documented in the ARM architecture reference manual. The failed stacking is signalled by the MMFSR.MSTKERR or UFSR.STKOF bit, depending on which fault was taken.
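As a rough illustration, a fault handler could tell the two cases apart by looking at the combined CFSR value (on a real target this would be read from `SCB->CFSR`). The sketch below is host-side code so that it can be compiled anywhere. The bit positions assume the usual CFSR layout (MMFSR in bits 7:0, UFSR in bits 31:16), and `classify_stack_fault` is a name I made up.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit positions inside the 32-bit CFSR:
//   MMFSR occupies bits 7:0, MSTKERR is bit 4 of MMFSR.
//   UFSR occupies bits 31:16, STKOF is bit 4 of UFSR.
constexpr std::uint32_t kCfsrMstkerr = 1u << 4;
constexpr std::uint32_t kCfsrStkof   = 1u << (16 + 4);

enum class StackFault {
    None,              // the fault was not caused by exception stacking
    MpuStackingFault,  // MMFSR.MSTKERR: MPU rejected the stacking access
    StackLimit         // UFSR.STKOF: stack limit register was violated
};

// Hypothetical helper: classify a CFSR snapshot taken in a fault handler.
constexpr StackFault classify_stack_fault(std::uint32_t cfsr) {
    if (cfsr & kCfsrStkof) {
        return StackFault::StackLimit;
    }
    if (cfsr & kCfsrMstkerr) {
        return StackFault::MpuStackingFault;
    }
    return StackFault::None;
}
```

In both fault cases the interrupted task's context is gone, so the practical response is probably the same: terminate or restart the offending task rather than return to it.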

    Test program

    // Configuration defines:
    // #define TESTCASE 0                // 0, 1 or 2
    // #define ENABLE_STACK_LIMIT        // Enables SPLIM registers
    // #define ENABLE_MPU                // Enables MPU in unprivileged mode
    
    #include <stm32u5xx.h>
    #include <cstring>
    
    enum {
        MSP = 0x20001000,   // Main stack pointer
        MSS = 32,           // Main stack size
        PSP = 0x20000F80,   // Process stack pointer
        PSS = 32,           // Process stack size
        XSO = 0x800,        // Offset of stack area from MSP
        XSS = 0x1000,       // Total size of stack area
    
        FLASH_START     = 0x08000000,
        FLASH_END       = 0x08010000
    };
    
    #define EXCEPTION_STUB(func)                                                                            \
        extern "C" [[gnu::naked]] void func() {                                                             \
            __asm volatile (                                                                                \
                "ldr r0, =$0xDDCCBBAA\n"                                                                    \
                "push {r0}\n"               /* Push marker value to stack to see it in the debugger */      \
                "add sp, 4\n"               /* Restore stack pointer after push */                          \
                "bkpt\n"                                                                                    \
                "bx lr\n"                                                                                   \
                ::: "r0", "memory"                                                                          \
            );                                                                                              \
        }
    
    EXCEPTION_STUB(HardFault_Handler)
    EXCEPTION_STUB(BusFault_Handler)
    EXCEPTION_STUB(MemManage_Handler)
    EXCEPTION_STUB(UsageFault_Handler)
    EXCEPTION_STUB(SVC_Handler)
    
    int main() {
        memset((void *) (MSP - XSS), 0x00, XSS + XSO);
        memset((void *) (MSP - MSS), 0x55, MSS);
        memset((void *) (PSP - PSS), 0xAA, PSS);
    
        SCB->SHCSR = SCB_SHCSR_USGFAULTENA_Msk | SCB_SHCSR_MEMFAULTENA_Msk | SCB_SHCSR_BUSFAULTENA_Msk;
    
    #if defined(ENABLE_MPU)
        /* Regions must be 32-byte aligned to meet MPU requirements */
        static_assert(((PSP - PSS) & 0x1F) == 0);
        static_assert((PSP & 0x1F) == 0);
        static_assert((FLASH_START & 0x1F) == 0);
        static_assert((FLASH_END & 0x1F) == 0);
    
        /* Region 0: stack, RW, execute-never */
        MPU->RNR = 0;
        MPU->RBAR = (PSP - PSS) | (0b10 << MPU_RBAR_SH_Pos) | (0b01 << MPU_RBAR_AP_Pos) | MPU_RBAR_XN_Msk;
        MPU->RLAR = ((PSP - 1) & MPU_RLAR_LIMIT_Msk) | MPU_RLAR_EN_Msk;
    
        /* Region 1: flash, RO, executable */
        MPU->RNR = 1;
        MPU->RBAR = FLASH_START | (0b10 << MPU_RBAR_SH_Pos) | (0b11 << MPU_RBAR_AP_Pos);
        MPU->RLAR = ((FLASH_END - 1) & MPU_RLAR_LIMIT_Msk) | MPU_RLAR_EN_Msk;
    
        MPU->MAIR0 = 0b01000100;    // Normal memory, non-cacheable
        MPU->CTRL = MPU_CTRL_ENABLE_Msk | MPU_CTRL_PRIVDEFENA_Msk;
    #endif
    
        __set_MSP(MSP);
        __set_PSP(PSP);
    
    #if defined(ENABLE_STACK_LIMIT)
        __set_MSPLIM(MSP - MSS);
        __set_PSPLIM(PSP - PSS);
    #endif
    
        __set_CONTROL(__get_CONTROL() | CONTROL_SPSEL_Msk | CONTROL_nPRIV_Msk);
        __ISB();
    
    #if TESTCASE == 0
        /* Stack pointer stays valid in this test case */
        /* Decrement it so stack frame (32 bytes) won't fit */
        __asm volatile ("sub sp, 4");
    #elif TESTCASE == 1
        /* Stack pointer is manually adjusted to cause stack overflow */
        __asm volatile (
            "ldr r0, =$0x20000F00\n"
            "mov sp, r0\n"
            "isb\n"
            ::: "r0", "memory"
        );
    #elif TESTCASE == 2
        /* Stack pointer is corrupted upwards and placed above the original stack */
        __asm volatile (
            "ldr r0, =$0x20000FA0\n"
            "mov sp, r0\n"
            "isb\n"
            ::: "r0", "memory"
        );
    #endif
    
        __asm volatile (
            "ldr r0, =$0x44332211\n"    /* Put markers in the registers to make stack frame more visible in memory view */
            "ldr r1, =$0x88776655\n"
            "bkpt\n"                    /* Last chance to inspect state of the core */
            "svc 123\n"                 /* Trigger exception */
            "bkpt\n"                    /* Halt again if SVC has returned */
            ::: "r0", "r1", "memory"
        );
    
        return 0;
    }
    

    Implemented test cases (TESTCASE 0, 1 and 2 in the code, in that order):

    1. Simple stack overflow. SPLIM is sufficient to catch this.
    2. SP is moved below the current stack. SPLIM is sufficient to catch this. The exception is raised as soon as SP is written (this is documented behavior too); no memory access is required.
    3. SP is moved above the current stack. The MPU is required to catch this.

    SPLIM is mostly redundant when the MPU is active, but it is still useful when another accessible MPU region sits directly next to the stack region, because in that case an overflow does not generate a MemManage fault.

    Both a thread ("regular") stack overflow and a context stacking failure set UFSR.STKOF. From the handler's point of view the exact reason for the overflow does not matter: the task context is lost either way.
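The expected outcomes can be summarised with a small host-side model. This is only a sketch of my understanding, not a simulation of the core: it assumes a basic 32-byte frame, checks the stack limit before the MPU (an ordering I chose for the model), and treats test case 1 as if the fault happened at exception entry, although in reality it is raised earlier, when SP is written. `Config`, `Outcome` and `exception_entry` are made-up names. The addresses come from the test program.

```cpp
#include <cassert>
#include <cstdint>

enum class Outcome { FrameStacked, StkofUsageFault, MemManageFault };

struct Config {
    std::uint32_t sp;          // SP when the exception is taken
    std::uint32_t stack_base;  // lowest valid stack address (PSP - PSS)
    std::uint32_t stack_top;   // one past the highest valid address (PSP)
    bool limit_enabled;        // PSPLIM programmed to stack_base
    bool mpu_enabled;          // MPU region covers [stack_base, stack_top)
};

constexpr std::uint32_t kFrameBytes = 32;  // basic frame only (assumption)

// Simplified model: which fault, if any, replaces the original exception?
constexpr Outcome exception_entry(const Config& c) {
    const std::uint32_t frame_start = c.sp - kFrameBytes;
    if (c.limit_enabled && frame_start < c.stack_base) {
        return Outcome::StkofUsageFault;
    }
    if (c.mpu_enabled && (frame_start < c.stack_base || c.sp > c.stack_top)) {
        return Outcome::MemManageFault;
    }
    return Outcome::FrameStacked;  // frame written, possibly outside the stack
}

// Convenience wrapper using the constants of the test program.
constexpr Outcome run_case(std::uint32_t sp, bool limit, bool mpu) {
    return exception_entry(Config{sp, 0x20000F60u, 0x20000F80u, limit, mpu});
}
```

In this model, test case 3 (SP at 0x20000FA0) slips past the stack limit check and is caught only when the MPU is enabled, which matches the observation above.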

    References

    The observed behavior is documented in the following parts of the ARMv8-M architecture reference manual:

    1. B3.18 Exception handling

      RWBND: Preemption of current execution causes the following basic sequence:

      • R0-R3, R12, LR, RETPSR, including CONTROL.SFPA, are stacked.
      • The return address is determined and stacked.
      • <...>
      • The exception to be taken is chosen, and IPSR.Exception is set accordingly. The setting of IPSR.Exception to a nonzero value causes the PE to change to Handler mode.

      This implies that context stacking happens while the PE is still in Thread mode, with all security restrictions still active.

    2. B3.19 Exception entry, context stacking

      RVNSK: If one or more of the following exceptions is generated during the stacking operations on exception entry the PE is permitted to abandon any remaining stacking operations:

      • MemManage fault
      • STKOF UsageFault

      IFKBH: If a MemManage fault, BusFault, or AUVIOL SecureFault occurs on a stacking memory access during exception entry, then stacking of Additional state context is optional.

    3. B3.21 Stack limit checks

      RZLZG: On a violation of a stack limit during either exception entry or tail-chaining:

      • In a PE with the Main Extension, a synchronous STKOF UsageFault is generated. Otherwise, a HardFault is generated.
      • The stack pointer is set to the stack limit value.
      • Push operations to addresses below the stack limit value are not performed.

      IBJHX: When an instruction updates the stack pointer, if it results in a violation of the stack limit, it is the modification of the stack pointer that generates the exception, rather than an access that uses the out-of-range stack pointer.

    4. B3.24 Exceptions during exception entry

      ILBGQ: During exception entry exceptions can occur <...>, for example a MemManage fault on the push to the stack.
      <...>
      When the exception entry sequence itself causes an exception, the latter exception is a derived exception.

      RMRTR: For Derived exceptions, late-arrival preemption is mandatory.