c debugging embedded

Why does a global statically initialized variable have a different value when debugging? / Why is my debugger loading different code to my device?


I am doing some embedded development practice in C and ran into a puzzling issue that popped up suddenly.

I have a variable that is initialised in main.c:

// Led colour setting
BSP_LED_HSL hsl = {
    0,
    1,
    2
};

I am not sure if it is relevant, but BSP_LED_HSL is typedef'd as shown below. SFP_11_20 is just a signed fixed-point type I created myself, and I tried to take it out of the picture by simply assigning hsl = {0, 1, 2} as above.

typedef int32_t SFP_11_20;

typedef SFP_11_20 BSP_LED_HSL_TYPE;
typedef BSP_LED_HSL_TYPE BSP_LED_HSL[3];

The top of main.c looks like this:

int main(void) {    
    
    BSP_init(); // Lots of initialization done here, cannot remove it from the example.
    // These are simply printf macros. They get sent over UART.
    JSM_PRINTF("[%u] BSP initialized...\n", BSP_get_time_millis());
    JSM_PRINTF("HSL[0]: %i\n", hsl[0]);
    JSM_PRINTF("HSL[1]: %i\n", hsl[1]);
    JSM_PRINTF("HSL[2]: %i\n", hsl[2]);
    // Things are already wrong here, so I am snipping code here.
}

When running the code normally I get the expected result:

[0] BSP initialized... 
HSL[0]: 0 
HSL[1]: 1
HSL[2]: 2  
[1] LM initialized...

However, when debugging and breaking at the start of main, before BSP_init(), hsl seems to have defaulted to 0xFFFFFFFF = -1 in all elements, and this also shows in the output. In fact, even when I break in the ResetHandler the values are already wrong. I think the debugger is showing the correct values, because this is the output:

[0] BSP initialized...
HSL[0]: -1
HSL[1]: -1
HSL[2]: -1

What could possibly cause this variable to change during debugging? Normally I would suspect that I made a mistake with memory management and overwrote it somehow, but that does not seem to be the case here.

Where should I start looking? The image I am flashing is the same when debugging, as far as I know, but is there a way to check this?
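One rough way to check this is to read the flash contents back into a file and compare that file byte by byte with the .bin that was flashed. The sketch below is a small host-side helper, not part of the firmware. The function name `first_difference` and any file names passed to it are hypothetical, and it assumes the read-back dump covers exactly the same address range and length as the .bin.

```c
#include <stdio.h>

/* Compare two files byte by byte.
   Returns -1 if they are identical, -2 if either file cannot be opened,
   otherwise the offset of the first byte that differs (this is also the
   length of the shorter file when one is a prefix of the other). */
long first_difference(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    long offset = 0;
    long result = -1;

    if (a == NULL || b == NULL) {
        result = -2;
    } else {
        for (;;) {
            int ca = fgetc(a);
            int cb = fgetc(b);
            if (ca != cb) {      /* mismatch, or one file ended early */
                result = offset;
                break;
            }
            if (ca == EOF) {     /* both files ended at the same offset */
                break;
            }
            ++offset;
        }
    }

    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return result;
}
```

Running this once on a dump taken after a normal flash and once on a dump taken after a debugger launch would show whether, and at which offset, the two images diverge.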

Update, overview:

Update 2:

The problem has been further narrowed down by attaching the debugger instead of launching it. Everything seems fixed now, but of course that still doesn't explain what happens or how to launch the debugger properly. This is the line that I changed:

"request": "attach" // Instead of "launch" in `launch.json`

EDIT:

Extra info about my toolchain:

armclang -MMD -c -g -std=c11 -DTM4C123GH6PM -DJSM_ENABLE --include-directory=C:\code\custom-cmsis -IC:\code\custom-cmsis\CMSIS-arm-default\Core\Include -I./src/bsp -I./src/jlib -I./src/features -I./src/runtime_environment -I./src/jorgOS -D__MICROLIB  --target=arm-arm-none-eabi -mcpu=cortex-m4 -mfpu=none -mfloat-abi=soft device/TM4C123GH6PM/system_TM4C123.c -o build/./device/TM4C123GH6PM/system_TM4C123.c.o        
armlink --info sizes --map --list target-linking-info --library_type=microlib --scatter=./device/TM4C123GH6PM/scatter.txt build/./src/main.c.o build/./src/bsp/bsp_led.c.o build/./src/bsp/bsp.c.o build/./src/bsp/uart.c.o build/./src/bsp/bsp_timer.c.o build/./src/features/led_manager.c.o <snip/snip> build/./device/TM4C123GH6PMC123.s.o build/startup_TM4C123.s.o build/./device/TM4C123GH6PM/system_TM4C123.c.o -o build/image/TM4C123GH6PM.axf
fromelf --bin -o build/image/TM4C123GH6PM.bin build/image/TM4C123GH6PM.axf
lmflash build/image/TM4C123GH6PM.bin -r -v -i ICDI

EDIT 2:

armclang.exe --version

Product: Keil MDK Community (non-commercial free of charge)
Component: Arm Compiler for Embedded 6.22
Tool: armclang [5ee92100]

The target is the TM4C123GH6PM chip. For debugging I use the Cortex extension in VS Code, which has the following settings. I don't really know how the debugger works or what commands it is running in the background.

{
            "name": "Cortex",
            "cwd": "${workspaceRoot}",
            "executable": "./build/image/TM4C123GH6PM.axf",
            "type": "cortex-debug",
            "request": "launch",
            "servertype": "openocd",
            "interface": "jtag",
            "device": "TM4C123GH6PM",
            "searchDir": ["C:\\tools\\openocd\\xpack-openocd-0.12.0-3\\openocd\\scripts\\board"],
            "configFiles": ["ti_ek-tm4c123gxl.cfg"],
            "svdFile": "./device/TM4C123GH6PM/TM4C123GH6PM.svd",
            "runToMain": false,
            //"cmsisPack": "${command:device-manager.getDevicePack}"
        },

EDIT 3:

Adding a canary global variable. I found out multiple global variables are corrupted when debugging.

I explored this by adding a uint32_t canary global in several scenarios and reading its value at a breakpoint.

// Scenario 1: non-zero initializer
uint32_t canary = 0xDEADBEEF;

int main(void) {
    // Breakpoint shows canary is 0xFFFFFFFF
    canary = 0x1; // Here it correctly changes to 0x1.
// SNIP
}

// Scenario 2: zero initializer (separate build)
uint32_t canary = 0x0;

int main(void) {
    // Breakpoint shows canary is 0x0.
    canary = 0x1; // Canary correctly changes to 0x1
// SNIP
}
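The canary experiment can also be turned into a run-time check, so the firmware itself can tell whether its initialised globals were set up before main() ran. This is only a minimal sketch: `CANARY_EXPECTED` and `data_init_ok()` are hypothetical names that are not part of the project.

```c
#include <stdbool.h>
#include <stdint.h>

#define CANARY_EXPECTED 0xDEADBEEFu

/* Non-zero initializer, so the value has to be put in RAM by the startup
   code before main() runs. */
uint32_t canary = CANARY_EXPECTED;

/* Returns true when the canary still holds its initial value. */
bool data_init_ok(void)
{
    /* Read through a volatile pointer so the compiler cannot fold the
       comparison into a constant. */
    return *(volatile uint32_t *)&canary == CANARY_EXPECTED;
}
```

Calling `data_init_ok()` as the first statement of main(), before anything else can write to the variable, separates "the initial value never arrived" from "something overwrote it later".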

EDIT 4 (printing addresses):

The following extra print statement in main.c results in &HSL[0]: 0x2000000C both when debugging and not debugging. The address is the same and agrees with the debugger and the linking information. I also saw some questions about what library I am using. I am using microlib.

// Snip
BSP_init();
JSM_PRINTF("&HSL[0]: 0x%08X\n", (unsigned int)(uintptr_t)&hsl[0]); // Cast so the argument matches %X
//Snip

EDIT 5:

The debugger does try to download things to the device itself:

13-target-download
-> 13+download,{section="EXECUTION_REGION_RAM",section-size="32",total-size="96002"}
-> 13+download,{section="EXECUTION_REGION_RAM",section-sent="32",section-size="32",total-sent="32",total-size="96002"}
-> 13+download,{section="EXECUTION_REGION_ROM",section-size="17628",total-size="96002"}

Could it be that this fails, and this warning has something to do with it?

-> &"warning: Loadable section \"EXECUTION_REGION_RAM\" outside of ELF segments\n  in "
warning: Loadable section "EXECUTION_REGION_RAM" outside of ELF segments

Solution

  • Disclaimer: I'm answering my own question with a partial explanation and a work-around, because to me this solves the problem of not being able to debug. I am willing to try better solutions and investigate a bit more if that is helpful or interesting to others.

    Explanation

    When the debugger is launched, it takes the .axf file, and gdb at some point instructs the gdb server to load that file onto the device. For an unknown reason this is not done correctly and somehow results in different code on the device.

    This different code either does not contain or skips the code that initialises some of the global variables, resulting in 0xFFFFFFFF, which according to some other contributors here is the default value in memory.

    Workaround

    To be able to debug again, we must either fix what is flashed to the device, or we must prevent the debugger from trying to load new data into the target by attaching the debugger instead of launching it. This is accomplished by altering the following line in launch.json for your debugging configuration.

    "request": "launch"
    

    must become

    "request": "attach"
    

    The real answer?

    The real answer fixes gdb/gdb-server or how we use and configure these tools. I do not know how to do that.
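To illustrate the explanation above, here is a host-side simulation of how initialised globals normally get their values: the initial values are stored in flash, and startup code copies them to RAM before main() runs. It is a sketch under assumptions, not the actual armlink scatter-loading code (which may also decompress the data), and it does not prove that this is what goes wrong here. All names (`flash_load_region`, `ram_hsl`, the `simulate_*` functions) are hypothetical, and it assumes that erased flash reads back as 0xFF in every byte.

```c
#include <stdint.h>
#include <string.h>

#define N_WORDS 3

/* Stand-in for the flash area that holds the initial values of hsl. */
static uint8_t flash_load_region[N_WORDS * sizeof(int32_t)];

/* Stand-in for hsl at its run-time address in RAM. */
static int32_t ram_hsl[N_WORDS];

/* Assumption: erased flash reads back as 0xFF in every byte. */
void simulate_erased_flash(void)
{
    memset(flash_load_region, 0xFF, sizeof flash_load_region);
}

/* A correctly programmed image stores the initialisers {0, 1, 2} in flash. */
void simulate_programmed_flash(void)
{
    const int32_t init[N_WORDS] = {0, 1, 2};
    memcpy(flash_load_region, init, sizeof init);
}

/* What startup code does before main(): copy initial values into RAM. */
void simulate_startup_copy(void)
{
    memcpy(ram_hsl, flash_load_region, sizeof ram_hsl);
}

/* Read one element of the simulated hsl array. */
int32_t read_hsl(int index)
{
    return ram_hsl[index];
}
```

With the programmed image, the copy yields 0, 1, 2. With the erased source, every element becomes 0xFFFFFFFF, which prints as -1 with %i and matches the symptom seen when launching the debugger.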