I have an embedded platform that brings its own stdlib functions like malloc and printf in a static library. I need to compile this library with LTO. Unfortunately, in this combination (-flto + -nostdlib + linking with stdlib replacements from an .a) the linker cannot find the functions.
I have prepared an MWE that should run on most Unix machines, but since it contains multiple files I have put it into a repo: https://github.com/stefanct/lto_static_libs
The included makefile lets you switch some features on and off for testing:
- nostdlib=y: add -nostdlib to the linking stage
- nolto=y: disable LTO
- libfunc=y: enable a call to a non-standard function within the library (you will see why at the end!)

The gist is to have one module containing a standard function, e.g.:
int puts(const char *s) {
    return 2;
}
That module is compiled into an object file with -flto, put into a static library with gcc-ar, and eventually used when linking an application.
In my setup (GCC 11 branch built from source and GNU ld 2.31.1 from Debian Buster) I get the following results:
No options set: OK - the printf from the library gets overridden(?) by the standard function:
$ make -B
Using GCC 11.0.0
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -c -o libtest.o libtest.c
lto-dump -list libtest.o
Type Visibility Size Name
function default 4 lib_func
function default 4 puts
function default 4 printf
gcc-nm libtest.o
00000000 T lib_func
00000000 T printf
00000000 T puts
rm -f libtest.a
gcc-ar -cvq libtest.a libtest.o
a - libtest.o
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -c -o main.o main.c
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -o exe main.o -L. -ltest
$ ./exe
hurga
No stdlib but also without LTO: OK - linking works fine(ish - running segfaults, but that's to be expected I guess and could be worked around with -nodefaultlibs, but I don't care here)
$ make -B nostdlib=y nolto=y
Using GCC 11.0.0
gcc -Wall -Wextra -Wno-unused-parameter -c -o libtest.o libtest.c
gcc-nm libtest.o
0000000000000074 T lib_func
0000000000000000 T printf
0000000000000065 T puts
rm -f libtest.a
gcc-ar -cvq libtest.a libtest.o
a - libtest.o
gcc -Wall -Wextra -Wno-unused-parameter -c -o main.o main.c
gcc -Wall -Wextra -Wno-unused-parameter -o exe main.o -L. -ltest -nostdlib
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
No stdlib but leaving LTO enabled: suddenly puts is no longer found.
However, as you can see, the object file that gets put into the library contains the function just fine (and even gcc-nm libtest.a shows the same).
This is the case I would like to fix. Why is this breaking?
$ make -B nostdlib=y
Using GCC 11.0.0
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -c -o libtest.o libtest.c
lto-dump -list libtest.o
Type Visibility Size Name
function default 4 lib_func
function default 4 puts
function default 4 printf
gcc-nm libtest.o
00000000 T lib_func
00000000 T printf
00000000 T puts
rm -f libtest.a
gcc-ar -cvq libtest.a libtest.o
a - libtest.o
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -c -o main.o main.c
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -o exe main.o -L. -ltest -nostdlib
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
/usr/bin/ld: /tmp/cczvyrrg.ltrans0.ltrans.o: in function `main':
<artificial>:(.text+0xe): undefined reference to `puts'
collect2: error: ld returned 1 exit status
make: *** [makefile:39: exe] Error 1
Interestingly enough, if we call another unrelated (non-standard) function in the same library things start to work again!?
$ make -B nostdlib=y libfunc=y
Using GCC 11.0.0
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -D LIB_FUNC -c -o libtest.o libtest.c
lto-dump -list libtest.o
Type Visibility Size Name
function default 4 lib_func
function default 4 puts
function default 4 printf
gcc-nm libtest.o
00000000 T lib_func
00000000 T printf
00000000 T puts
rm -f libtest.a
gcc-ar -cvq libtest.a libtest.o
a - libtest.o
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -D LIB_FUNC -c -o main.o main.c
gcc -Wall -Wextra -Wno-unused-parameter -flto -ffat-lto-objects -D LIB_FUNC -o exe main.o -L. -ltest -nostdlib
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
Do I see a bug in binutils/ld? Is this fixed upstream?
It's a bit speculative, but I think I understand what's happening after running into the same problem.
When you allow builtins (which is the default unless you specify -fno-builtin or -ffreestanding), GCC interprets calls to standard functions such as memcpy() or printf() as __builtin_memcpy() or __builtin_printf(), and handles these specially for optimization purposes. For instance, memcpy() might get inlined into a few moves for small sizes, and printf() can be replaced with puts() for constant formats ending in a \n.
What this means is that a call to printf() might end up calling the library function printf(), or instead the library function puts(), and the compiler doesn't know which until optimization is performed. When LTO is enabled, the compiler only gets that information once LTO has run.
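To make the builtin substitution concrete, here is a minimal sketch. The function names (say_line, say_counted, load_u32) are made up for illustration and are not from the repo. It assumes GCC with builtins enabled. As far as I know, GCC only swaps printf() for puts() when the return value is unused, which is why the second function is expected to keep its printf() call.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the result of printf() is ignored and the constant
 * format ends in '\n', so GCC may emit a call to puts("hurga") instead. */
void say_line(void) {
    printf("hurga\n");
}

/* Hypothetical helper: the return value is used. printf() returns the number
 * of characters written, which puts() does not guarantee, so GCC is expected
 * to leave this call as printf(). */
int say_counted(void) {
    return printf("hurga\n");
}

/* Hypothetical helper: a fixed-size 4-byte memcpy() is a typical candidate
 * for being expanded inline into a plain load/store rather than a call. */
uint32_t load_u32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compiling this with gcc -O2 -S and searching the assembly for calls to puts and printf should show which call survived in each function. Adding -fno-builtin should keep every call as written in the source.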
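This presents a problem because the workflow of LTO goes like this:
1. The linker resolves symbols using the symbol tables of the LTO objects and decides which objects, including archive members, are needed.
2. LTO runs on those objects and generates the final code; this is where builtins get resolved.
3. The linker links the generated code and rescans for any symbols that newly appeared.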
In your cases without LTO, optimization is performed early, so by the time you get to linking, builtins have been "resolved" and calls to e.g. puts() are already visible, and linking proceeds normally.
But when you enable LTO, puts() isn't part of step #1 so it gets removed, and it is only after step #2 resolves the __builtin_printf() into puts() that the reference appears. At this point the only mechanism available to get it is to fetch a plain puts symbol from a non-LTO archive, which requires fat LTO objects.
Additionally, there is (at time of writing) an ld bug whereby this rescan doesn't consider fat LTO objects (see GCC bug and associated ld.bfd bug), seemingly appearing in binutils 2.27 (August 2016) and fixed in binutils 2.43 (August 2024).
My best guess at explaining your results is:
- No options set: ld should have fetched puts() from your archive at step #3, but you had a bugged ld, and it went to fetch the system libc one.
- No stdlib, no LTO: puts() is visible before linking so fine.
- No stdlib, LTO enabled: ld should have fetched puts() from your archive at step #3, but due to the bug it isn't, and it cannot fall back on system libc.
- No stdlib, LTO enabled, libfunc=y: calling lib_func() from the same object as puts() directly exposes it early so it's included in step #1, bypassing the need for a rescan in step #3.

If that's true, then the solution is to update the linker and keep using -ffat-lto-objects.