c, gcc, linker, runtime

Resolving references at link time or run time?


Friends, I have two files, a.c and b.c. I have defined a function foo in a.c, and it is called from b.c.

From what I understand, when the compiler compiles b.c, it sees that the implementation of foo is not in b.c, so it adds an entry for foo to the symbol table, to be resolved at link time. I understand this concept.

Now, I call a different function, printf, in b.c, which is implemented in glibc. From what I understand, printf can be linked at load time or at run time. If printf is linked at run time, there must be a stub for each call to printf, which is resolved at run time using a system call.

My question is: is my understanding correct? And how does the compiler determine that a function such as foo will be resolved by the linker and not at run time?

I noticed some similar questions but could not understand their significance here.


Solution

  • I find your question slightly hard to read, so I'm not quite sure how you understand it; I'll just describe how it works.

    1. If the symbol is in the same file (b.c) then the compiler refers to it directly. The linker is not used to resolve anything.

    2. If the symbol is not in the same file, and -fPIC is not specified, then the compiler simply emits a call to an undefined symbol. In this case the linker will search for the symbol in other .o files, or in static libraries, and insert the direct reference at link time by basically pasting the address into the blank.

      This is exactly how you would normally build a program (as opposed to a library). If the program uses dynamic libraries then there may be some symbols that cannot be fixed up at link time. If so, the linker will check that the library does have them and it will be left to the dynamic linker to finish the job at run time.

      It would be possible to do exactly this in shared libraries also, only with the dynamic linker always pasting the addresses into the program at run time, but to do that would mean that the shared library couldn't be shared: each program would have to have its own copy with its own fix-ups. This is why that doesn't happen.

    3. If the symbol is not in the same file, and -fPIC is specified, then the compiler does not use the symbol name directly. Instead, it calls functions via a PLT (Procedure Linkage Table) and gets the address of other symbols via a GOT (Global Offset Table).

      The GOT is a special table created by the linker. It is basically a list of slots, one per undefined symbol, similar to the blanks you'd find in a regular non-PIC program, except that the code reaches each slot by its offset from the base of the GOT. The dynamic linker fills in the slots at run time. The code finds the GOT either through a dedicated CPU register (as on 32-bit x86) or relative to the current instruction (as on x86-64), so the table can always be found.

      The PLT is a set of trampolines created by the linker. The compiler creates jumps into the PLT, and each PLT entry bounces on to the real location of the function through a slot in the GOT. In many cases those slots are not filled in when the library is loaded: the first call through a PLT entry lands in the dynamic linker, which looks up the function and patches the GOT slot so that later calls go straight to the target (lazy binding). On common platforms such as x86 the PLT code itself is not modified; only the GOT slot is.

      This is why dynamic libraries are normally built with -fPIC: the GOTs and PLTs can be modified for each program while still keeping the text of the libraries unmodified, and therefore allowing them to remain shared.

    So, now the answers to your questions:

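    I think the 'stub' that you talked about might be the PLT? Note that it is resolved by the dynamic linker, which is ordinary user-space code, rather than by a system call.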

    The compiler does not know when a function will be resolved. It only knows that it can't resolve it itself. In fact, when you use dynamic libraries, the linker does not even try to resolve symbols fully (although I think it does check that they are defined in the library); this means that it's possible to override a specific function in a library by providing another function with the same name. Tools like tsocks use this with LD_PRELOAD to intercept library calls.