c linux gcc fpic position-independent-code

Why can't I compile with -fPIE when I can with -fPIC?


I have an interesting compilation problem. First, here is the code being compiled.

$ ls
Makefile main.c sub.c sub.h
$ gcc -v
...
gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)

## Makefile
%.o: CFLAGS+=-fPIE #[2]

main.so: main.o sub.o
    $(CC) -shared -fPIC -o $@ $^

//main.c
#include "sub.h"

int main_func(void){
    sub_func();
    subsub_func();

    return 0;
}

//sub.h
#pragma once
void subsub_func(void);
void sub_func(void);

//sub.c
#include "sub.h"
#include <stdio.h>
void subsub_func(void){
    printf("%s\n", __func__);
}
void sub_func(void){
    subsub_func();//[1]
    printf("%s\n", __func__);
}

When I compile this, I get the following error:

$ LANG=en make
cc -fPIE   -c -o main.o main.c
cc -fPIE   -c -o sub.o sub.c
cc -shared -fPIC -o main.so main.o sub.o
/usr/bin/ld: sub.o: relocation R_X86_64_PC32 against symbol `subsub_func' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
make: *** [main.so] Error 1

After this, I tried two separate changes: removing the line marked [1], and using -fPIC instead of -fPIE at [2]. With either change, the build succeeded.

$ make #[1]
cc -fPIE   -c -o main.o main.c
cc -fPIE   -c -o sub.o sub.c
cc -shared -fPIC -o main.so main.o sub.o
$ make #[2]
cc -fPIC   -c -o main.o main.c
cc -fPIC   -c -o sub.o sub.c
cc -shared -fPIC -o main.so main.o sub.o

Why does this happen?

I have heard that a call to a function within the same object goes through the PLT when the code is compiled with -fPIC, but jumps to the function directly when it is compiled with -fPIE. My guess is that the function call mechanism used with -fPIE avoids relocation, but I would like an exact and accurate explanation.

Would you help me?

Thank you, all.


Solution

  • The only code generation difference between -fPIC and -fPIE for the code shown is in the call from sub_func to subsub_func. With -fPIC, that call goes through the PLT; with -fPIE, it's a direct call. In assembly dumps (cc -S), that looks like this:

    --- sub.s.pic   2017-12-07 08:10:00.308149431 -0500
    +++ sub.s.pie   2017-12-07 08:10:08.408068650 -0500
    @@ -34,7 +34,7 @@ sub_func:
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
    -   call    subsub_func@PLT
    +   call    subsub_func
        leaq    __func__.2258(%rip), %rsi
        leaq    .LC0(%rip), %rdi
        movl    $0, %eax
    

    In unlinked object files, it's a change of relocation type:

    --- sub.o.dump.pic  2017-12-07 08:13:54.197775840 -0500
    +++ sub.o.dump.pie  2017-12-07 08:13:54.197775840 -0500
    @@ -22,7 +22,7 @@
       1f:  55                      push   %rbp
       20:  48 89 e5                mov    %rsp,%rbp
       23:  e8 00 00 00 00          callq  28 <sub_func+0x9>
    -           24: R_X86_64_PLT32  subsub_func-0x4
    +           24: R_X86_64_PC32   subsub_func-0x4
       28:  48 8d 35 00 00 00 00    lea    0x0(%rip),%rsi        # 2f <sub_func+0x10>
                2b: R_X86_64_PC32   .rodata+0x14
       2f:  48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 36 <sub_func+0x17>
    

    And, on this architecture, when you link a shared library using cc -shared, the linker does not allow the input object files to contain R_X86_64_PC32 relocations targeting global symbols, thus the error you observed when you used -fPIE instead of -fPIC.

    Now, you are probably wondering why direct calls within a shared library are not allowed. In fact, they are allowed, but only when the callee is not a global. For instance, if you declared subsub_func with static, then the call target would be resolved by the assembler and there would be no relocation at all in the object file, and if you declared it with __attribute__((visibility("hidden"))) then you would get an R_X86_64_PC32 relocation but the linker would allow it because the callee is no longer exported from the library. But in both cases subsub_func would not be callable from outside the library anymore.
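As a minimal sketch of the hidden-visibility option just described, here is a variant of sub.c. It assumes GCC or a compatible compiler that understands the visibility attribute. The counter `subsub_calls` is hypothetical and not part of the original code; it is there only so the behavior can be checked.

```c
#include <stdio.h>

/* Hypothetical counter, not in the original sub.c; it exists only so the
   sketch can be checked by a test. */
static int subsub_calls;

/* Hidden visibility: other object files linked into the same library can
   still call subsub_func, but the symbol is not exported, so nothing
   outside the library can call or override it. */
__attribute__((visibility("hidden")))
void subsub_func(void) {
    subsub_calls++;
    printf("%s\n", __func__);
}

void sub_func(void) {
    subsub_func();  /* direct call; accepted by the linker for a hidden symbol */
    printf("%s\n", __func__);
}
```

With this variant, the declaration in sub.h would normally carry the same attribute, and code outside the library could no longer call subsub_func, as noted above.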

    Now you're probably wondering what it is about global symbols that means you have to call them through the PLT from a shared library. This has to do with an aspect of the ELF symbol resolution rules that you may find surprising: any global symbol in a shared library can be overridden by either the executable, or an earlier library in the link order. Concretely, if we leave your sub.h and sub.c alone but make main.c read like this:

    //main.c
    #include "sub.h"
    #include <stdio.h>
    
    void subsub_func(void) {
        printf("%s (main)\n", __func__);
    }
    
    int main(void){
        sub_func();
        subsub_func();
    
        return 0;
    }
    

    so that it now has an official executable entry point in it, but also a second definition of subsub_func, and we compile sub.c into a shared library and main.c into an executable that calls it, and run the whole thing, like this:

    $ cc -fPIC -c sub.c -o sub.o
    $ cc -c main.c -o main.o
    $ cc -shared -Wl,-soname,libsub.so.1 sub.o -o libsub.so.1
    $ ln -s libsub.so.1 libsub.so
    $ cc main.o -o main -L. -lsub
    $ LD_LIBRARY_PATH=. ./main
    

    the output will be

    subsub_func (main)
    sub_func
    subsub_func (main)
    

    That is, both the call from main to subsub_func, and the call from sub_func, within the library, to subsub_func, were resolved to the definition in the executable. For that to be possible, the call from sub_func must go through the PLT.

    You can change this behavior with an additional linker switch, -Bsymbolic.

    $ cc -shared -Wl,-soname,libsub.so.1 -Wl,-Bsymbolic sub.o -o libsub.so.1
    $ LD_LIBRARY_PATH=. ./main
    subsub_func
    sub_func
    subsub_func (main)
    

    Now the call from sub_func is resolved to the definition within the library. In this case, using -Bsymbolic allows sub.c to be compiled with -fPIE instead of -fPIC, but I don't recommend you do that. There are other effects of using -fPIE instead of -fPIC, such as changing how access to thread-local storage needs to be done, and those cannot be worked around with -Bsymbolic.
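The lookup order behind both outputs can be modelled in a few lines of C. This is only a toy model under stated assumptions, not how the dynamic linker is implemented: the `struct module` tables, `find_in`, and `resolve` are all hypothetical names invented for illustration, and the functions return their names as strings instead of printing them. The `bsymbolic` parameter stands in for linking the library with -Bsymbolic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: the real dynamic linker works on ELF symbol tables. */
typedef const char *(*func_t)(void);

struct symbol { const char *name; func_t addr; };
struct module { const char *name; const struct symbol *syms; size_t nsyms; };

static const char *exe_subsub(void) { return "subsub_func (main)"; }
static const char *lib_subsub(void) { return "subsub_func"; }
static const char *lib_sub(void)    { return "sub_func"; }

static const struct symbol exe_syms[] = { { "subsub_func", exe_subsub } };
static const struct symbol lib_syms[] = {
    { "subsub_func", lib_subsub },
    { "sub_func",    lib_sub    },
};

/* Global search order: the executable first, then libraries in link order. */
static const struct module search_order[] = {
    { "main",        exe_syms, 1 },
    { "libsub.so.1", lib_syms, 2 },
};

/* Look for `name` among the global symbols defined by one module. */
static func_t find_in(const struct module *m, const char *name) {
    assert(m != NULL && name != NULL);
    for (size_t i = 0; i < m->nsyms; i++)
        if (strcmp(m->syms[i].name, name) == 0)
            return m->syms[i].addr;
    return NULL;
}

/* Resolve a reference to `name` made from module `from`.
   bsymbolic != 0 models -Bsymbolic: the referencing module's own
   definition is tried before the global search order. */
static func_t resolve(const struct module *from, const char *name, int bsymbolic) {
    if (bsymbolic) {
        func_t f = find_in(from, name);
        if (f != NULL)
            return f;
    }
    for (size_t i = 0; i < sizeof search_order / sizeof search_order[0]; i++) {
        func_t f = find_in(&search_order[i], name);
        if (f != NULL)
            return f;
    }
    return NULL;
}
```

In this model, a reference to subsub_func from the library resolves to the executable's definition by default and to the library's own definition when `bsymbolic` is set, which mirrors the two outputs shown above.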