In GDB, I can do the following and get address of a string literal:
(gdb) p &"aaa"
$3 = (char (*)[4]) 0x614c20
As I understand it, a string literal is an rvalue with no symbol bound to it, so why can I take its address?
What kind of address is it? Is there somewhere that all string literals are stored? There are infinitely many possible string literals, so does that mean the string literal is "created and put in memory" on demand?
GDB calls `malloc` inside the debuggee to allocate space for the string literal, then writes the string literal to the freshly allocated memory chunk and returns its address.
It achieves this by building a fake stack frame which calls into malloc() and which is set up to return to a breakpoint instruction. It then silently resumes the program, letting the call to malloc complete, and then retakes control of the program when the program returns to the breakpoint.
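As a rough illustration, the same effect can be approximated by hand with an explicit function call from the GDB prompt. This is only a sketch of the idea, not what GDB runs internally. It assumes a live process with libc loaded, and `$buf` is a convenience variable introduced here for the example.

```gdb
# Ask the debuggee to allocate 4 bytes (3 characters + terminating NUL).
# The cast tells GDB the return type in case malloc has no debug info.
(gdb) set $buf = (char *) malloc(4)
# Write the bytes into the debuggee's memory one at a time.
(gdb) set var $buf[0] = 'f'
(gdb) set var $buf[1] = 'o'
(gdb) set var $buf[2] = 'o'
(gdb) set var $buf[3] = 0
# Show the address and the string now stored there.
(gdb) print $buf
```

The address printed depends on the debuggee's heap state, so it will differ from run to run.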
You can prove that this is happening by setting a breakpoint on `malloc` itself, then running `print &"foo"`. GDB will hit the breakpoint with the following message:
The program being debugged stopped while in a function called from GDB.
Evaluation of the expression containing the function
(malloc) will be abandoned.
When the function is done executing, GDB will silently stop.
If you’re curious, you can check the value of the first argument ($rdi on x86-64) and verify that it’s equal to 4 (length of the string literal + 1). You can also check the stack trace to see the fake breakpoint return address.
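The experiment can be sketched as the command sequence below. It assumes an x86-64 Linux debuggee that has already been started and is stopped somewhere (for example at `main`). Output is omitted because addresses and frame details vary between systems.

```gdb
# Stop whenever malloc is entered, including calls made on GDB's behalf.
(gdb) break malloc
# Triggers the hidden malloc call; GDB stops with the message shown above.
(gdb) print &"foo"
# First integer argument under the x86-64 System V calling convention.
(gdb) print $rdi
# Inspect the stack, including the frame GDB fabricated for the call.
(gdb) backtrace
```

Note that the breakpoint also fires for the program's own calls to `malloc` once it is resumed, so delete it after the experiment.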
This behaviour might be surprising; after all, one doesn’t normally expect `print &"foo"` to execute code in the debuggee. If you want to disable this behaviour, run `set may-call-functions off`. After running this, an expression like `print &"foo"` will yield `Cannot call functions in the program: may-call-functions is off.` `print` statements that don’t require executing debuggee functions will continue to work normally.
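A short session showing the setting in use might look like this. It assumes a GDB version that supports `may-call-functions`. The error line is the one quoted above, and other output is omitted.

```gdb
# Forbid GDB from running any function in the debuggee.
(gdb) set may-call-functions off
(gdb) print &"foo"
Cannot call functions in the program: may-call-functions is off.
# Check the current value of the setting.
(gdb) show may-call-functions
# Restore the default behaviour when done.
(gdb) set may-call-functions on
```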