I am experiencing a segmentation fault when running my multi-threaded embedded application. GDB gave me a hint that the stack could be corrupt, which led me to believe the stack is too small for the problematic thread. Increasing the stack size seems to remove the issue, but I would like to confirm it a bit further. What are my options here? Is it possible to find out the current stack size at the event of the segfault?
In gcc, compile with -fstack-usage. That will cause a .su file to be output for each object file, containing a plain-text stack report for each function. Like:
main.c:36:6:bar 48 static
main.c:41:5:foo 88 static
main.c:47:5:main 8 static
However, that reports only the stack frame for each function; the stack usage of a function is its own frame plus the frames of every function on the deepest call path below it. Working that out by hand for all possible call paths to determine the worst-case stack depth is not practical for any non-trivial application. You need a tool that can inspect the call graph and use the .su data to work that out for you. Here is an example of a Perl script that combines the output of objdump and the .su files to generate a full stack report like:
  Func                    Cost   Frame   Height
------------------------------------------------------------------------
> main                     176      12        4
  foo                      164      92        3
  bar                       72      52        2
> INTERRUPT                 28       0        2
  __vector_I2C1             28      28        1
  foobar                    20      20        1
R recursiveFunct            20      20        1
  __vector_UART0            12      12        1
Peak execution estimate (main + worst-case IV):
  main = 176, worst IV = 28, total = 204
The stack usage for your thread will be the stack usage of its entry-point/thread function, plus perhaps some margin for whatever thread overhead the OS may require.
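As a rough illustration of turning such an estimate into a thread stack size, here is a minimal sketch that assumes the application uses POSIX threads (the question does not say which threading API is in use). The helper names thread_stack_size, start_thread_with_stack and example_worker are hypothetical; pthread_attr_setstacksize, PTHREAD_STACK_MIN and sysconf(_SC_PAGESIZE) are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: take the estimated worst-case usage of the thread
 * function plus a safety margin, enforce the platform minimum, and round
 * up to a whole number of pages. */
size_t thread_stack_size(size_t worst_case_bytes, size_t margin_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t want = worst_case_bytes + margin_bytes;

    if (want < (size_t)PTHREAD_STACK_MIN)
        want = (size_t)PTHREAD_STACK_MIN;
    return (want + page - 1) / page * page;
}

/* Hypothetical helper: create a thread with an explicit stack size.
 * Returns 0 on success or a pthread error number. */
int start_thread_with_stack(pthread_t *thread, void *(*entry)(void *),
                            void *arg, size_t stack_bytes)
{
    pthread_attr_t attr;
    int rc;

    assert(thread != NULL && entry != NULL);
    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(thread, &attr, entry, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Trivial thread function used only to demonstrate the helpers. */
void *example_worker(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}
```

A caller would compute the size once, for example thread_stack_size(estimate, margin), pass it to start_thread_with_stack, and later pthread_join the thread. How much margin the OS itself needs is platform-specific, so the margin value is something to determine for your target rather than a fixed rule.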
Note that calls through function pointers and recursion will defeat this kind of static analysis, so you may need to assess those cases separately by considering the stack usage of the functions that may be called and the likely depth of recursion.
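To make the summation concrete, here is a minimal sketch of the calculation a call-graph tool performs. It is not the Perl script mentioned above. The frame sizes are taken from the .su excerpt, but the call relationships (main calls foo and bar, foo calls bar) are assumed purely for illustration, and the struct and function names are hypothetical. It also ignores any per-call overhead not counted in the .su figures, and it only works for an acyclic graph with no indirect calls.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLEES 4

/* One node of a hypothetical call graph. Frame sizes would come from the
 * .su files; callee lists would come from objdump or a call-graph tool. */
struct func_info {
    const char *name;
    unsigned frame_bytes;       /* frame size reported in the .su file */
    int callees[MAX_CALLEES];   /* indices into the table, -1 terminates */
};

/* Worst-case stack depth in bytes when execution starts at table[idx]:
 * the function's own frame plus the deepest path through its callees.
 * Assumes no recursion and no calls through function pointers. */
unsigned worst_case_depth(const struct func_info *table, int idx)
{
    unsigned deepest_callee = 0;

    assert(table != NULL && idx >= 0);
    for (int i = 0; i < MAX_CALLEES && table[idx].callees[i] >= 0; i++) {
        unsigned d = worst_case_depth(table, table[idx].callees[i]);
        if (d > deepest_callee)
            deepest_callee = d;
    }
    return table[idx].frame_bytes + deepest_callee;
}

/* Frame sizes from the .su excerpt; the call edges are an assumption. */
const struct func_info example_table[] = {
    { "main", 8,  { 1, 2, -1, -1 } },   /* main -> foo, bar */
    { "foo",  88, { 2, -1, -1, -1 } },  /* foo  -> bar      */
    { "bar",  48, { -1, -1, -1, -1 } }, /* leaf function    */
};
```

With this assumed graph, worst_case_depth(example_table, 0) follows main, foo, bar and yields 8 + 88 + 48 = 144 bytes. A recursive function or an indirect call would need a manually supplied bound instead.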
The answer to "How to determine maximum stack usage in embedded system with gcc?" may also be useful.
To help detect stack issues at runtime there are various instrumentation options related to stack checking and protection at https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Instrumentation-Options.html#Instrumentation-Options
Knowing the "current stack size" at the point you get a segfault is not particularly helpful. It won't tell you how much stack is needed; it will only tell you how far out of bounds the access happened to be when the MMU trapped the fault, which is likely to be as soon as the code accesses memory outside the allocated stack space, to within the resolution of the page size. It just tells you your stack is not big enough, which you kind of knew already.
A "dynamic" technique for stack analysis is to "oversize" the stack, fill it with a known single-byte value, and then, after running the code through a test sequence designed to exercise all likely call paths, inspect the stack region to see where the "high-tide mark" is relative to the start of the region. You then "right-size" your stack accordingly. That is a common technique, but it depends on exercising all likely paths. Typically error and exception handling paths are omitted, so you can end up getting a stack overflow just when your code is trying to handle some other error. It's risky.
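The fill-and-inspect technique described above can be sketched as follows. This is a minimal illustration, not tied to any particular RTOS: the function names and the 0xA5 fill value are arbitrary choices, and the code assumes a descending stack (usage starts at the highest address of the region and grows toward the lowest). On a real target the region would be the thread's actual stack memory, painted before the thread starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u   /* arbitrary, easily recognised pattern */

/* Fill the whole stack region with the known pattern before use. */
void stack_paint(uint8_t *region, size_t size)
{
    assert(region != NULL);
    memset(region, STACK_FILL_BYTE, size);
}

/* Return the high-tide mark in bytes, assuming a descending stack.
 * Untouched bytes are counted from the lowest address upward; everything
 * above the first overwritten byte is treated as having been used. */
size_t stack_high_water(const uint8_t *region, size_t size)
{
    size_t untouched = 0;

    assert(region != NULL);
    while (untouched < size && region[untouched] == STACK_FILL_BYTE)
        untouched++;
    return size - untouched;
}
```

After the test sequence has run, stack_high_water(region, size) gives the deepest usage observed, to which a margin can be added. The result can be slightly low if the deepest bytes written happen to equal the fill value, and it says nothing about paths the tests never exercised.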