c++ elf dynamic-linking abi relocation

Where is the order in which ELF relocations are applied specified?


Consider the following two files on a Linux system:

use_message.cpp

#include <iostream>

extern const char* message;
void print_message();

int main() {
    std::cout << message << '\n';
    print_message();
}

libmessage.cpp

#include <iostream>
const char* message = "Meow!";   // 1. absolute address of string literal
                                 //    needs runtime relocation in a .so
void print_message() {
    std::cout << message << '\n';
}

We can compile use_message.cpp into an object file, compile libmessage.cpp into a shared library, and link them together, like so:

$ g++ use_message.cpp -c -pie -o use_message.o
$ g++ libmessage.cpp -fPIC -shared -o libmessage.so
$ g++ use_message.o libmessage.so -o use_message

The definition for message originally lives in libmessage.so. When use_message is executed, the dynamic linker performs relocations that:

  1. Update the message definition inside libmessage.so with the load address of the string data
  2. Copy the definition of message from libmessage.so into use_message's .bss section
  3. Update the global offset table in libmessage.so to point to the new version of message inside use_message

The relevant relocations, as dumped by readelf, are:

use_message

  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000004150  000c00000005 R_X86_64_COPY     0000000000004150 message + 0

This is relocation number 2 in the list above.

libmessage.so

  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000004040  000000000008 R_X86_64_RELATIVE                    2000
000000003fd8  000b00000006 R_X86_64_GLOB_DAT 0000000000004040 message + 0

These are relocation numbers 1 and 3, respectively.

There's a dependency between relocation numbers 1 and 2: the update to libmessage.so's message definition must happen before this value is copied into use_message, otherwise use_message will not point to the correct location.
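To make the dependency concrete, here is a minimal toy model of the three relocations. It is only a sketch: the load bases, the `Module` type, and the helper names are hypothetical, and the slot for message in libmessage.so is assumed to hold zero before relocation. The offsets and the addend are the ones from the readelf output above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model of a loaded module: offset -> 64-bit slot (hypothetical type).
struct Module {
    std::uint64_t base;                          // assumed load address
    std::map<std::uint64_t, std::uint64_t> mem;  // link-time offset -> value
};

// R_X86_64_RELATIVE: store base + addend at the given offset.
void apply_relative(Module& m, std::uint64_t offset, std::uint64_t addend) {
    m.mem[offset] = m.base + addend;
}

// R_X86_64_COPY: copy the symbol's current value from the library
// into the executable's own storage.
void apply_copy(Module& exe, std::uint64_t exe_offset,
                const Module& lib, std::uint64_t lib_offset) {
    exe.mem[exe_offset] = lib.mem.at(lib_offset);
}

// R_X86_64_GLOB_DAT: the library's GOT slot receives the address of the
// executable's copy of the symbol.
void apply_glob_dat(Module& lib, std::uint64_t got_offset,
                    const Module& exe, std::uint64_t exe_offset) {
    lib.mem[got_offset] = exe.base + exe_offset;
}

// Returns the value of `message` as seen by the executable, relocating
// either the library first or the executable first.
std::uint64_t message_seen_by_exe(bool library_first) {
    Module exe{0x555555554000, {}};  // assumed base
    Module lib{0x7ffff7fb0000, {}};  // assumed base
    lib.mem[0x4040] = 0;             // assumed content before relocation

    if (library_first) {
        apply_relative(lib, 0x4040, 0x2000);       // relocation 1
        apply_glob_dat(lib, 0x3fd8, exe, 0x4150);  // relocation 3
        apply_copy(exe, 0x4150, lib, 0x4040);      // relocation 2
    } else {
        apply_copy(exe, 0x4150, lib, 0x4040);      // relocation 2, too early
        apply_relative(lib, 0x4040, 0x2000);       // relocation 1
        apply_glob_dat(lib, 0x3fd8, exe, 0x4150);  // relocation 3
    }
    return exe.mem[0x4150];
}
```

In this model, `message_seen_by_exe(true)` yields the relocated address of the string (library base plus 0x2000), while `message_seen_by_exe(false)` yields the stale, unrelocated value.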

My question is: how is the order for applying relocations specified? Is there something encoded in the ELF files that specifies this? Or in the ABI? Or is the dynamic linker just expected to work out the dependencies between relocations itself and ensure that any relocations that write to a given memory address are run before any relocations that read from the same location? Does the static linker only output relocations such that the ones in the executable can always be processed after the shared library ones?


Solution

  • My question is: how is the order for applying relocations specified? Is there something encoded in the ELF files that specifies this? Or in the ABI? Or is the dynamic linker just expected to work out the dependencies between relocations itself and ensure that any relocations that write to a given memory address are run before any relocations that read from the same location? Does the static linker only output relocations such that the ones in the executable can always be processed after the shared library ones?

    I think the relocation resolving order is not specified by any standard; each dynamic loader defines its own order. To support copy relocations, the main executable is relocated last. Linkers only produce copy relocations for executable links (-no-pie/-pie) and are aware of the dynamic loader semantics.


    Quoting https://maskray.me/blog/2021-01-18-gnu-indirect-function#relocation-resolving-order:

    There are two parts: the order within a module and the order between two modules.

    glibc rtld processes relocations in the reverse search order (reversed l_initfini) with a special case for the rtld itself. The main executable needs to be processed last so that R_*_COPY relocations see the libraries' already-relocated data. If A has an ifunc referencing B, generally B needs to be relocated before A. Without ifunc, the resolving order of shared objects can be arbitrary.

    Let's say we have the following dependency tree.

    main
      dep1.so
        dep2.so
          dep3.so
            libc.so.6
          dep4.so
            dep3.so
            libc.so.6
        libc.so.6
      libc.so.6
    

    l_initfini contains main, dep1.so, dep2.so, dep4.so, dep3.so, libc.so.6, ld.so. The relocation resolving order is ld.so (bootstrap), libc.so.6, dep3.so, dep4.so, dep2.so, dep1.so, main, ld.so.
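The effect of reversing the list can be checked with a short sketch. It does not reproduce how glibc computes l_initfini; it simply takes the list quoted above (without ld.so, whose bootstrap special case is left out), reverses it, and verifies that every module comes after its direct dependencies. All function names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Names = std::vector<std::string>;

// Relocation order sketched as the reverse of an l_initfini-like list.
Names relocation_order(Names initfini) {
    std::reverse(initfini.begin(), initfini.end());
    return initfini;
}

// True if every module appears strictly after all of its direct dependencies.
bool dependencies_first(const Names& order,
                        const std::map<std::string, Names>& deps) {
    auto pos = [&order](const std::string& name) {
        return std::find(order.begin(), order.end(), name) - order.begin();
    };
    for (const auto& entry : deps)
        for (const auto& dep : entry.second)
            if (pos(dep) >= pos(entry.first)) return false;
    return true;
}

// The list from the text, minus ld.so.
Names example_initfini() {
    return {"main", "dep1.so", "dep2.so", "dep4.so", "dep3.so", "libc.so.6"};
}

// Direct dependencies read off the dependency tree in the text.
std::map<std::string, Names> example_deps() {
    return {
        {"main",    {"dep1.so", "libc.so.6"}},
        {"dep1.so", {"dep2.so", "libc.so.6"}},
        {"dep2.so", {"dep3.so", "dep4.so"}},
        {"dep3.so", {"libc.so.6"}},
        {"dep4.so", {"dep3.so", "libc.so.6"}},
    };
}
```

Calling `relocation_order(example_initfini())` puts libc.so.6 first and main last, and `dependencies_first` holds for that order, which is the property copy relocations rely on.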

    Within a module, glibc rtld resolves relocations in order. Assuming that both DT_RELA (.rela.dyn) and DT_JMPREL (.rela.plt) are present, the glibc logic looks like the following:

    // Simplified from elf/dynamic-link.h (pseudocode, not compilable as-is)
    ranges[0] = {DT_RELA, DT_RELASZ, 0};             // .rela.dyn, never lazy
    ranges[1] = {DT_JMPREL, DT_PLTRELSZ, do_lazy};   // .rela.plt, possibly lazy
    // If binding is eager and .rela.plt directly follows .rela.dyn
    // (this equality holds in practice), process both as one range.
    if (!do_lazy && ranges[0].start + ranges[0].size == ranges[1].start) {
      ranges[0].size += ranges[1].size;
      ranges[1] = {};
    }
    for (int ranges_index = 0; ranges_index < 2; ++ranges_index)
      elf_dynamic_do_Rela (... ranges[ranges_index]);
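The merging step can be written as a small self-contained function. This is a sketch of the simplified logic above, not glibc's actual code: `Range` and `plan_ranges` are hypothetical names, and the addresses in the usage note are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One contiguous block of relocation entries (hypothetical type).
struct Range {
    std::uint64_t start;  // address of the first entry
    std::uint64_t size;   // size in bytes
    bool lazy;            // whether entries may be bound lazily
};

// Decide which ranges to process, in order. With eager binding and
// adjacent tables, .rela.dyn and .rela.plt collapse into a single range.
std::vector<Range> plan_ranges(Range rela, Range jmprel, bool do_lazy) {
    rela.lazy = false;       // .rela.dyn is always processed eagerly
    jmprel.lazy = do_lazy;   // .rela.plt follows the binding mode
    if (!do_lazy && rela.start + rela.size == jmprel.start) {
        rela.size += jmprel.size;
        return {rela};
    }
    return {rela, jmprel};
}
```

For example, `plan_ranges({0x1000, 0x30, false}, {0x1030, 0x18, false}, false)` returns one range of size 0x48, whereas passing `true` for `do_lazy` keeps two ranges with the second marked lazy. Either way, entries are visited in ascending address order.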
    

    musl ldso/dynlink.c has:

    /* The main program must be relocated LAST since it may contain
     * copy relocations which depend on libraries' relocations. */
    reloc_all(app.next);
    reloc_all(&app);
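The two calls can be modeled with a linked list of loaded objects. This is a rough sketch rather than musl's code: the `Dso` type and `relocate_like_musl` are hypothetical, and the `relocated` flag that makes the second walk skip already-processed libraries is an assumption about how the real reloc_all avoids doing the work twice.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for a loaded object in the loader's list (hypothetical).
struct Dso {
    std::string name;
    Dso* next;
    bool relocated;
    explicit Dso(std::string n)
        : name(std::move(n)), next(nullptr), relocated(false) {}
};

// Walk the list from p to the end, "relocating" each object once.
// The log records the order in which objects were processed.
void reloc_all(Dso* p, std::vector<std::string>& log) {
    for (; p; p = p->next) {
        if (p->relocated) continue;  // assumed guard against double work
        log.push_back(p->name);
        p->relocated = true;
    }
}

// Libraries first, then the main program, as in the quoted snippet.
std::vector<std::string> relocate_like_musl(Dso& app) {
    std::vector<std::string> log;
    reloc_all(app.next, log);
    reloc_all(&app, log);
    return log;
}
```

With a list app -> liba.so -> libc.so, the resulting log is liba.so, libc.so, app, so the main program's copy relocations run only after every library has been relocated.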
    

    FreeBSD rtld uses a more sophisticated order, which makes certain ifunc code more robust.

    $ g++ use_message.cpp -c -pie -o use_message.o
    $ g++ libmessage.cpp -fPIC -shared -o libmessage.so
    $ g++ use_message.o libmessage.so -o use_message
    

    BTW, use_message (built from -fPIE relocatable files) needs a copy relocation because of GCC's HAVE_LD_PIE_COPYRELOC behavior on x86-64. With Clang, or with GCC on other architectures, PIE mode does not lead to copy relocations.