Tags: embedded, mbed, utest, unity-test-framework

DeepSleepLock underflow error when doing pow(2, ((m - 69.0f) / 12.0f)) - MBed OS


I'm using MBed OS on a NUCLEO_L432KC, with the MBed CLI to compile, flash, and test, and OpenOCD and gdb to debug. MBed has its own GreenTea test automation tool for unit testing on the embedded hardware, and it uses the utest and Unity testing frameworks.

When I use GreenTea to unit test this function:

float Piano::midiNumToFrequency(uint8_t m)
{
    float exp = (m - 69.0f) / 12.0f;
    return pow(2, exp);
}

I get a DeepSleepLock underflow error:

[1589410046.26][CONN][RXD] ++ MbedOS Error Info ++
[1589410046.30][CONN][RXD] Error Status: 0x80040124 Code: 292 Module: 4
[1589410046.35][CONN][RXD] Error Message: DeepSleepLock underflow (< 0)
[1589410046.37][CONN][RXD] Location: 0x8003B09
[1589410046.40][CONN][RXD] File: mbed_power_mgmt.c+197
[1589410046.43][CONN][RXD] Error Value: 0xFFFF
[1589410046.53][CONN][RXD] Current Thread: main Id: 0x20001200 Entry: 0x80044A7 StackSize: 0x1000 StackMem: 0x20001C18 SP: 0x2000FF04
[1589410046.62][CONN][RXD] For more info, visit: https://mbed.com/s/error?error=0x80040124&tgt=NUCLEO_L432KC
[1589410046.64][CONN][RXD] -- MbedOS Error Info --

Yet when I change the function to this:

float Piano::midiNumToFrequency(uint8_t m)
{
    float exp = (m - 69.0f);
    return pow(2, exp);
}

it works and tests fine.

MBed has an error status decoder, which says:

Use the "Location" reported to figure out the address of the location which caused the error or try building a non-release version with MBED_CONF_PLATFORM_ERROR_FILENAME_CAPTURE_ENABLED configuration enabled to capture the filename and line number where this error originates from.

When I enable MBED_CONF_PLATFORM_ERROR_FILENAME_CAPTURE_ENABLED, it reports the location as mbed_power_mgmt.c line 197, which is this function:

/** Send the microcontroller to sleep
 *
 * @note This function can be a noop if not implemented by the platform.
 * @note This function will be a noop in debug mode (debug build profile when MBED_DEBUG is defined).
 * @note This function will be a noop if the following conditions are met:
 *   - The RTOS is present
 *   - The processor turn off the Systick clock during sleep
 *   - The target does not implement tickless mode
 *
 * The processor is setup ready for sleep, and sent to sleep using __WFI(). In this mode, the
 * system clock to the core is stopped until a reset or an interrupt occurs. This eliminates
 * dynamic power used by the processor, memory systems and buses. The processor, peripheral and
 * memory state are maintained, and the peripherals continue to work and can generate interrupts.
 *
 * The processor can be woken up by any internal peripheral interrupt or external pin interrupt.
 *
 * @note
 *  The mbed interface semihosting is disconnected as part of going to sleep, and can not be restored.
 * Flash re-programming and the USB serial port will remain active, but the mbed program will no longer be
 * able to access the LocalFileSystem
 */
static inline void sleep(void)
{
#if DEVICE_SLEEP
#if (MBED_CONF_RTOS_PRESENT == 0) || (DEVICE_SYSTICK_CLK_OFF_DURING_SLEEP == 0) || defined(MBED_TICKLESS)
    sleep_manager_sleep_auto();
#endif /* (MBED_CONF_RTOS_PRESENT == 0) || (DEVICE_SYSTICK_CLK_OFF_DURING_SLEEP == 0) || defined(MBED_TICKLESS) */
#endif /* DEVICE_SLEEP */
}

Any ideas why this is happening or how to troubleshoot further?


Solution

  • This part:

    StackSize: 0x1000 
    StackMem: 0x20001C18 
    SP: 0x2000FF04
    

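    suggests that the stack pointer is no longer within the thread's own stack. With StackMem at 0x20001C18 and a StackSize of 0x1000, the stack spans 0x20001C18 to 0x20002C18, and the reported SP of 0x2000FF04 is far outside that range.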
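
The range check behind that observation can be written out as a small sketch. It uses only the three values from the error report; the function and constant names are hypothetical, and it assumes a descending stack whose lowest address is StackMem.

```cpp
#include <cstdint>

// Returns true when sp lies inside [stack_mem, stack_mem + stack_size].
// The upper bound is inclusive because an empty descending stack has
// SP equal to the top address.
constexpr bool spWithinStack(std::uint32_t stack_mem,
                             std::uint32_t stack_size,
                             std::uint32_t sp)
{
    return sp >= stack_mem && sp <= stack_mem + stack_size;
}

// Values copied from the error report in the question.
constexpr std::uint32_t kStackMem   = 0x20001C18u;
constexpr std::uint32_t kStackSize  = 0x1000u;
constexpr std::uint32_t kReportedSp = 0x2000FF04u;

// Compiles only if the reported SP is outside the main thread's stack.
static_assert(!spWithinStack(kStackMem, kStackSize, kReportedSp),
              "reported SP is outside 0x20001C18..0x20002C18");
```

The same check can be applied to any later error report by substituting its StackMem, StackSize, and SP values.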

    The cause of that cannot really be determined from just the code posted, but the reported location is irrelevant; when a function pops a return address from a corrupted stack, or through a corrupted stack pointer, the program counter could end up anywhere or nowhere.

    It is possible, for example, that your test thread has an insufficient stack allocation and the overflow has corrupted the stack or TCB of some other thread, which then crashes. That could produce the kind of error you are seeing, where the code indicated is unrelated to the source of the error. That is pure speculation, however; other error mechanisms, such as a buffer overrun, might cause similar non-deterministic behaviour.
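
One common way to test the insufficient-stack hypothesis is stack watermarking: fill the stack with a known pattern before the thread runs, then count how much of the pattern survives. The sketch below illustrates the idea on an ordinary buffer so it can run on a host PC. The fill value and function names are hypothetical, and this is not the Mbed OS or RTOS implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fill pattern; any value unlikely to occur in real data works.
constexpr std::uint32_t kFillPattern = 0xCCCCCCCCu;

// Pre-fill a stack region with the pattern before the thread starts.
void fillStack(std::vector<std::uint32_t>& stack)
{
    for (auto& word : stack) {
        word = kFillPattern;
    }
}

// Count the words, starting from the lowest address, that still hold the
// pattern. The stack grows downward, so these are words the thread never
// reached. A result of zero means the stack was fully used and may have
// overflowed into whatever lies below it.
std::size_t unusedWords(const std::vector<std::uint32_t>& stack)
{
    std::size_t n = 0;
    while (n < stack.size() && stack[n] == kFillPattern) {
        ++n;
    }
    return n;
}
```

If the unused count for the test thread reaches zero, or drops sharply once the division by 12.0f is added, that would support the overflow theory; if plenty of headroom remains, another corruption mechanism is more likely.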

    The critical thing to understand is that just because modifying this function appears to affect the result does not suggest that this function is itself at fault.
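
One way to rule the arithmetic out is to run it on a host PC, away from the RTOS and the test harness. The following sketch copies the expression from the question into a free function (the names are hypothetical) and checks every possible uint8_t input. Like the posted code, it returns the ratio relative to note 69.

```cpp
#include <cmath>
#include <cstdint>

// Same arithmetic as the question's Piano::midiNumToFrequency, written as a
// free function so it can be built without Mbed OS.
float midiNumToRatio(std::uint8_t m)
{
    float exponent = (m - 69.0f) / 12.0f;
    return static_cast<float>(std::pow(2.0, exponent));
}

// Returns true when every possible input gives a finite, positive result.
bool allInputsFinite()
{
    for (int m = 0; m <= 255; ++m) {
        float r = midiNumToRatio(static_cast<std::uint8_t>(m));
        if (!std::isfinite(r) || r <= 0.0f) {
            return false;
        }
    }
    return true;
}
```

If this passes on the host, the expression is well behaved for all inputs, which points back at the environment on the target (stack usage, memory corruption) rather than at the maths.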