
How to export dynamic symbols only for used functions?


On x86_64 Linux, I'm currently passing -rdynamic to gcc at link time to export the used symbols in a program, so that I can trace problems when it hits a SEGV, or in other places where I print a backtrace using glibc's backtrace(). This program depends on static libraries, which may contain other unused functions that depend on yet more static libraries. The program and libraries are built with -ffunction-sections, and the link uses -Wl,--gc-sections -Wl,--as-needed. This combination of flags allows the program to be built without any undefined reference errors.

At least, until binutils-2.22. Suddenly undefined reference errors started to pop up, and I now have to include all those other libraries used by the unused functions. I suppose the linker has started marking every function in an object file as needed when at least one function is referenced. Since other linkers (gold, mold) also show the same behavior, I guess this is the correct™ behavior, and ld.bfd before binutils-2.22 was not working as intended (but was working as I intended).

So now, to replicate the pre-2.22 behavior, I guess I have to specify which symbols to export manually, possibly via -Wl,--dynamic-list. But how do I get the list of used functions?

Minimal reproducible example

main.c

extern int give_747(void);

int main(void) {
    return give_747();
}

lib1.c (unused)

int return_123 (void) {
    return 123;
}

lib2.c

#include <execinfo.h>
#include <stdlib.h>
#include <unistd.h>
#include <backtrace.h>

extern int return_123(void);

static int cb (void *data, uintptr_t pc, const char *filename, int lineno, const char *function) {
    printf("%s:%d(%s)[%p]\n", filename, lineno, function, (void*)pc);
    return 0; /* zero tells backtrace_full() to continue with the next frame */
}

int give_747(void) {
    void *bt[4] = {0};
    static struct backtrace_state *bts = NULL;

    backtrace(bt, 4);
    printf("backtrace_symbols_fd():\n");
    backtrace_symbols_fd(bt, 4, STDOUT_FILENO);

    printf("\nlibbacktrace:\n");
    bts = backtrace_create_state(NULL, 0, NULL, NULL);
    backtrace_full(bts, 0, cb, NULL, NULL);
    printf("\n");

    return 747;
}

int return_246(void) {
    int tmp = return_123();
    return tmp+tmp;
}

Makefile

main: main.o lib2.a
        $(CC) -Wl,--gc-sections -Wl,--as-needed $^ -o $@ -L/path/to/libbacktrace/lib -lbacktrace

maindyn: main.o lib2.a
        $(CC) -Wl,--gc-sections -Wl,--as-needed -rdynamic $^ -o $@ -L/path/to/libbacktrace/lib -lbacktrace

%.o: %.c
        $(CC) -I/path/to/libbacktrace/include -ggdb -O3 -ffunction-sections -c $< -o $@

lib1.a: lib1.o
        ar curs $@ $^

lib2.a: lib2.o
        ar curs $@ $^

Here is the terminal output for the linking of maindyn

cc -Wl,--gc-sections -rdynamic main.o lib2.a -o maindyn -L/path/to/libbacktrace/lib -lbacktrace
lib2.a(lib2.o): In function `return_246':
/home/syukri/playground/gcsections/lib2.c:30: undefined reference to `return_123'
collect2: error: ld returned 1 exit status
make: *** [maindyn] Error 1

※ This sounds like an XY problem, where the actual problem is to produce a backtrace with function names, and this -rdynamic shenanigan is just a sidetrack. But I need to be able to produce the trace on SEGV, which I currently do by preloading libSegFault.so, and that also calls backtrace(). Otherwise, I would have just used libbacktrace.


Solution

  • Ordinarily the combination of the compile option -ffunction-sections and the linkage option --gc-sections would make it simple to obtain a list of the used global functions defined in the program: it would consist simply of the symbols listed as defined global functions in the global symbol table of the program. Those symbols would not be dynamically exported by default, so the linker would know that they cannot be referenced from shared libraries. -ffunction-sections causes the compiler to put each function func in its own linkage section called .text.func, and --gc-sections causes the linker to discard each linkage section that it can prove is not used. The linker will then discard all function sections whose eponymous functions are not referenced within the executable, since they can't be referenced from outside it, and the ones that survive are the used ones.

    But your need for the -rdynamic option frustrates this good work. It causes all global symbols in the executable to be dynamically exported, making them visible from outside the executable, and thus impossible for the linker to discard. Specifically in your example, -rdynamic makes return_246 visible, making return_123 callable, forcing you to link lib1.a or have an undefined reference error.

    This snag can be dodged by building your program maindyn as per the following makefile.

    $ cat Makefile
    .PHONY: clean
    
    .DEFAULT_GOAL := maindyn 
    
    main-partial.o: main.o lib2.a
        ld -o $@ -r --gc-sections --as-needed -Map=mapfile $^  --entry=main -L /usr/lib/gcc/x86_64-linux-gnu/13 -lbacktrace
    
    used_functions.ld: main-partial.o
        echo "{" > $@
        echo $$(readelf -W --syms $< | awk '$$4 == "FUNC" && $$5 == "GLOBAL" && $$6 != "HIDDEN" && $$7 != "UND"' | awk '{print $$NF ";"}') >> $@
        echo "};" >> $@
    
    maindyn: main-partial.o used_functions.ld
        $(CC) -o $@ -Wl,--dynamic-list=used_functions.ld main-partial.o 
    
    %.o: %.c
        $(CC) -ggdb -O3 -ffunction-sections -c $< -o $@
    
    lib1.a: lib1.o
        ar curs $@ $^
    
    lib2.a: lib2.o
        ar curs $@ $^
        
    clean:
        rm -f maindyn lib1.a lib2.a used_functions.ld *.o
    

    This performs two linkages, excluding lib1.a, without duplication and without coercing the admission of any undefined symbols into the program.

    First, it performs a partial linkage (-r) with --gc-sections that, as far as possible, completes the linkage of main.o, lib2.a and libbacktrace, discarding unused sections and producing the single object file main-partial.o. We can use --gc-sections when producing a partially linked object file as long as we specify the root symbol relative to which sections must be proved unused, so we specify --entry=main. -Map=mapfile is not needed, but I've generated the linker mapfile just for its evidence.

    Next, the build extracts from main-partial.o the file used_functions.ld, which lists, in the format of a linker --dynamic-list input file, all the symbols that are global functions, defined and not hidden.
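To make the filtering rule in that step explicit, here is a minimal C sketch of what the readelf/awk pipeline selects. It is an illustration only, not part of the build above: the function names exported_function_name and write_dynamic_list are hypothetical, and the parser assumes the eight-column row layout (Num, Value, Size, Type, Bind, Vis, Ndx, Name) that readelf -W --syms prints on x86_64. It takes the eighth column as the name, whereas the awk version takes the last field; for the rows of interest here these should coincide.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 and leaves the symbol name in `name` when `line` (one row of
 * `readelf -W --syms` output) describes a defined, global, non-hidden
 * function; returns 0 otherwise, in which case `name` is unspecified.
 * `name` must have room for at least 256 bytes.
 */
int exported_function_name(const char *line, char *name)
{
    char type[16], bind[16], vis[16], ndx[16];

    /* Skip Num, Value and Size; keep Type, Bind, Vis, Ndx and Name. */
    if (sscanf(line, " %*s %*s %*s %15s %15s %15s %15s %255s",
               type, bind, vis, ndx, name) != 5)
        return 0;

    return strcmp(type, "FUNC") == 0 && strcmp(bind, "GLOBAL") == 0 &&
           strcmp(vis, "HIDDEN") != 0 && strcmp(ndx, "UND") != 0;
}

/*
 * Reads readelf rows from `in` and writes a dynamic list to `out`:
 * an opening brace, one "name;" entry per selected symbol, then "};".
 * Returns the number of symbols written.
 */
int write_dynamic_list(FILE *in, FILE *out)
{
    char line[1024], name[256];
    int count = 0;

    fputs("{\n", out);
    while (fgets(line, sizeof line, in) != NULL) {
        if (exported_function_name(line, name)) {
            fprintf(out, "  %s;\n", name);
            count++;
        }
    }
    fputs("};\n", out);
    return count;
}
```

A small driver whose main() calls write_dynamic_list(stdin, stdout) could then stand in for the awk commands, fed by readelf -W --syms main-partial.o and redirected to used_functions.ld. The awk one-liner is shorter; the sketch is only meant to spell out the four conditions a symbol has to meet.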

    Lastly, it performs a full final linkage of maindyn, to which the sole specified ELF input is main-partial.o and to which used_functions.ld is given as a dynamic exports list instead of -rdynamic. The executable contains no undefined symbols referenced by unused functions, because --gc-sections has already discarded those functions in the first, partial linkage.

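    Including libbacktrace in the partial link, rather than the final link, also puts the libbacktrace API (backtrace_create_state(), backtrace_full() and so on) into the dynamic exports file, so those functions can be resolved by name as well. Note that backtrace() and backtrace_symbols_fd() themselves are glibc functions, which libSegFault.so gets from libc.so.6; what they need from maindyn is dynamic symbol entries for your own functions, and used_functions.ld supplies those.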
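If preloading libSegFault.so is not an option on some system, the same kind of trace can be produced from inside the program. The following is a minimal sketch under stated assumptions, not something taken from the question: the names segv_trace_handler and install_segv_trace are hypothetical, and it relies only on glibc's backtrace() and backtrace_symbols_fd() plus POSIX sigaction(). Function names will still only appear for symbols present in the dynamic symbol table, so the dynamic exports list remains necessary.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: prints a raw backtrace to stderr, then lets the
   process die with the original signal. */
static void segv_trace_handler(int sig)
{
    static const char msg[] = "fatal signal caught, backtrace:\n";
    void *frames[32];
    int depth = backtrace(frames, 32);
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)ignored;
    /* backtrace_symbols_fd() writes straight to the descriptor and does
       not call malloc(), unlike backtrace_symbols(). */
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);

    /* SA_RESETHAND has already restored the default disposition, so the
       re-raised signal terminates the process once the handler returns. */
    raise(sig);
}

/* Installs the handler for SIGSEGV. Returns 0 on success, -1 on error. */
int install_segv_trace(void)
{
    struct sigaction sa;
    void *warmup[1];

    /* The first backtrace() call may load the unwinder library, so make
       that call here rather than inside the signal handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = segv_trace_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_trace() once at the start of main() would be enough. A handler that has to survive stack overflow would also need an alternate signal stack (sigaltstack() with SA_ONSTACK), which this sketch leaves out.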

    The build runs like:

    $ make
    cc -ggdb -O3 -ffunction-sections -c main.c -o main.o
    cc -ggdb -O3 -ffunction-sections -c lib2.c -o lib2.o
    ar curs lib2.a lib2.o
    ar: `u' modifier ignored since `D' is the default (see `U')
    ld -o main-partial.o -r --gc-sections --as-needed main.o lib2.a  --entry=main -L /usr/lib/gcc/x86_64-linux-gnu/13 -lbacktrace
    echo "{" > used_functions.ld
    echo $(readelf -W --syms main-partial.o | awk '$4 == "FUNC" && $5 == "GLOBAL" && $7 != "UND"' | awk '{print $NF ";"}') >> used_functions.ld
    echo "};" >> used_functions.ld
    cc -o maindyn -Wl,--dynamic-list=used_functions.ld main-partial.o
    

    The partial linkage mapfile shows:

    ...[cut]...
    Discarded input sections
    ...[cut]...
     .text.return_246
                    0x0000000000000000       0x14 ./lib2.a(lib2.o)
    ...[cut]...
    

    Since .text.return_246 has been discarded, nothing references return_123 any more.

    The dynamic exports file is:

    $ cat used_functions.ld 
    {
    backtrace_syminfo_to_full_callback; give_747; backtrace_pcinfo; 
    backtrace_alloc; backtrace_vector_release; backtrace_dwarf_add; 
    backtrace_syminfo_to_full_error_callback; backtrace_free; 
    backtrace_open; backtrace_close; backtrace_vector_grow; 
    backtrace_uncompress_zdebug; backtrace_create_state; backtrace_full; 
    backtrace_get_view; main; backtrace_initialize; 
    backtrace_uncompress_zstd; backtrace_qsort; backtrace_uncompress_lzma; 
    backtrace_release_view; backtrace_vector_finish; backtrace_syminfo;
    };
    

    And the program runs:

    $ ./maindyn 
    backtrace_symbols_fd():
    ./maindyn(give_747+0x36)[0x593768c6c506]
    /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x7caea982a1ca]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7caea982a28b]
    ./maindyn(+0x2405)[0x593768c5d405]
    
    libbacktrace:
    /home/imk/develop/so/scrap1/lib2.c:23(give_747)[0x593768c6c552]
    

    Sidebar

    I've left your --as-needed option in place, but when you say:

    "I suppose the linker has started marking every function in an object file as needed when at least one function is referenced."

    I suspect you might be misunderstanding this linker option. It has nothing to do with marking functions needed or not needed. It just tells the linker not to write a NEEDED libname.so entry into the dynamic section of the output ELF when libname.so is a shared library subsequently input to the linkage unless the output file references symbols defined in libname.so (i.e. it really needs libname.so). By default the linker will write such an entry whether symbols in libname.so are referenced or not, possibly leading the dynamic linker to load a redundant libname.so at runtime. --as-needed remains in effect for ensuing shared libraries until and unless cancelled by --no-as-needed. See man ld.

    Linux distros are divided between those that build GCC to preface --as-needed to the boilerplate libraries that it passes to the linker and those that go with the default. I don't know which sort you have. If it's one that goes with the default, then --as-needed may have a desired pruning effect on the dynamic linkage of your production program, and that may be why you've got it, but it has no influence on the linkage of symbols from object files, including object files that are packaged in static libraries. My Ubuntu system is one of those on which GCC passes --as-needed in its linker boilerplate. Removing --as-needed has no effect on my maindyn build, and you might check whether it is actually necessary in a production build based on my solution, if you use it.