c++ c gcc linker lto

GCC - where do Link Time Optimisations really apply?


I have some newbie questions about LTO in GCC, as IMHO it's kinda hard to find answers to them on the internet. Please consider the real use case in the image below (not very professional, but it should be meaningful enough for the sake of this thread).

There are four distinct libraries in the example. For the purpose of this example, let's assume every single module is compiled with the appropriate LTO flag (and the same applies to the link steps).

  1. inner.a - a static library which consists of two modules
  2. outer1.a - a static library which consists of one module along with the inner.a library embedded into it
  3. outer2.a - a static library which consists of two modules, similar to inner.a
  4. shared.so - a shared library which comprises outer1.a and outer2.a
  5. an executable which is linked against shared.so

[Image: Real use case]

Where can the LTO mechanism perform optimisations, and where can it not? From my understanding:

What about shared.so? It's a shared object, but it consists of the two static libs, so a full view should still be preserved in order to optimise when outer1.a and outer2.a are linked.

The executable, marked as the yellow box, is linked dynamically against shared.so, so the LTO mechanism can really do nothing and the "lucky chain" is broken there?

Also, I'd really appreciate any materials on this particular subject.


Solution

  • From my understanding:
    inner.a should got optimised - it's a static library

    You appear to have an incorrect mental model of linking. inner.a is just a container of .o files. At static link time, individual .o files are copied out of it (similar to taking a book off the shelf) and linked into an executable or a shared library. It is at that stage that LTO may happen.

    What about shared.so? It's a shared object but consists of the two static

    This is wrong again. shared.so (possibly) contains code and data which came from (were copied from) the various .o files, but it does not contain the objects themselves, and it certainly doesn't contain any archive libraries.


    So let's talk about LTO. Suppose that module1.c contains this code:

    extern int foo();
    int bar() { return foo() + 1; }
    

    and module5.c has this:

    int foo() { return 42; }
    

    When module1.c is compiled, the compiler cannot transform that code into

    int bar() { return 43; }
    

    because it has no idea what foo() returns.

    But when the LTO pass runs at the time of linking shared.so, that pass does know that foo() returns a constant value, and can therefore optimize bar() to also return a constant.
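    As a rough illustration of that point, the sketch below puts both definitions into a single file, which is approximately the view the LTO pass has when shared.so is linked. The name bar_folded is hypothetical and only shows the form the optimizer may reduce bar() to. The build commands in the comment are an assumed typical invocation, not taken from the question, and whether GCC actually performs the fold may depend on the optimization level and on whether foo() can be overridden at run time.

```c
/*
 * Sketch only: module1.c and module5.c merged into one file to mimic
 * the whole-program view available to the LTO pass.
 *
 * Assumed build of the real, separate modules (typical usage):
 *   gcc -O2 -flto -fPIC -c module1.c module5.c
 *   gcc -O2 -flto -shared -o shared.so module1.o module5.o
 */

/* From module5.c: the definition that module1.c cannot see on its own. */
int foo(void) { return 42; }

/* From module1.c: with foo() visible, the call can be inlined and folded. */
int bar(void) { return foo() + 1; }

/* Hypothetical name: the form bar() may be reduced to after optimization. */
int bar_folded(void) { return 43; }
```

    Both versions of the function return the same value; the difference is only in how much work is left at run time, which is what LTO can remove once it sees both modules together.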


    The executable marked as yellow box is linked dynamically against the shared.so so the LTO mechanism can really do nothing

    Correct.


    P.S. In addition, an archive library can't really contain another archive library the way your picture shows. You can put inner.a into outer.a, but if you do, the linker will simply ignore inner.a and treat outer.a as if it contained only module3.o.