I am writing an object file in assembly language to be included in a shared object. I am using the GNU toolchain, and my target is x86_64-pc-linux-gnu. Please consider the following (example) source:
.text
.globl f
f: leaq g(%rip),%rax
ret
.data
.globl g
.protected g
g: .quad 8
The crucial parts are the global protected symbol g and the reference to g in f. When I assemble the source using gcc -c -o example.o --shared -fpic example.s, objdump -x tells me that gas inserted a relocation for the local reference (some relocation entry is obviously necessary):
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000003 R_X86_64_PC32 g-0x0000000000000004
The problem shows up when I also try to link the file:
$ gcc -o example.so --shared -fpic example.s
/usr/bin/ld: /tmp/ccQ6BcLl.o: relocation R_X86_64_PC32 against protected symbol `g' can not be used when making a shared object
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
As far as I understand from reading Ian Lance Taylor's blog (but I might be mistaken), this is because the linker cannot guarantee pointer equality when symbol interposition happens (in some other object file).
As this will never be an actual problem with my symbol g in my shared object, I would like to silence ld. The Linux ABI 0.1 seems to say that I should add a .note.gnu.property section to my source that contains a setting of GNU_PROPERTY_NO_COPY_ON_PROTECTED. How can I do this in practice?
If possible, I don't want to add extra flags to the invocation of the assembler and linker, so I am looking for a solution where the necessary modifications are just part of the source file.
You cannot reference the symbol like this because, although it is now protected from interposition, it is still subject to copy relocation (explained below), like any other object-type symbol exported from a shared library. To fix this, go through the GOT, as with any other exported object-type symbol.
The designers of the ELF ABI intended shared objects to be transparent to the main program. ELF ABI programs (but not shared objects) are wholly ignorant of the presence of shared objects and are written as if all symbols used by the program were statically linked into the program. This includes object-type symbols, to which direct access is permitted. E.g. the main program can do
movq g(%rip), %rax
and gets the value of the same variable g your shared library uses. The way this works is that for all object-type symbols referenced by your main program but provided by a shared library, the linker looks up the size of the symbol at link time and allocates that much space in the BSS segment of the executable. The symbol (g in this case) is resolved to point to that space.
At load time, the dynamic loader finds the shared library that defines g, copies the data for g from the data segment of the shared object into the space reserved in the BSS segment of the main executable, and resolves the GOT entry for g to that address. This is called a copy relocation. Thus, the shared library, when accessing g, will access the same variable the main program can access. (If the main program does not access g, no copy relocation takes place and g is resolved to its definition within the shared library's data/BSS segment.)
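However, this scheme only works if the shared library accesses the symbol through the GOT. After a copy relocation, the variable the program actually uses lives in the executable and no longer at a fixed offset from the library's code, so a PC-relative reference from within the library would reach the original in the library's data segment instead of the copy. Thus, you must go through the GOT to access the symbol.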
I.e. do
movq g@GOTPCREL(%rip), %rax # find the address of g
movq (%rax), %rax # load the value of g
The address of g will not change while the program is running, so it suffices to do this once at the beginning of your code. The overhead should be low.
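Applied to the example from the question, this could look roughly as follows. Take it as a sketch rather than a definitive version: f returns the address of g as before, but reads it from the GOT. The .type and .size directives are my additions and are not in the original source; I am assuming you want them because the linker takes the amount of space to reserve for a copy relocation from the symbol's size.

```asm
        .text
        .globl  f
        .type   f, @function
f:      movq    g@GOTPCREL(%rip), %rax  # rax = address of g, loaded from the GOT
        ret
        .size   f, .-f

        .data
        .globl  g
        .protected g
        .type   g, @object              # assumption: not in the original source
        .size   g, 8                    # assumption: size used for copy relocation
g:      .quad   8
```

Linking this with the same command as in the question (gcc -o example.so --shared -fpic example.s) should no longer trigger the error, as the R_X86_64_PC32 relocation against g is gone.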
Workarounds include:
consider making the symbol hidden and only exposing it to the main executable through an accessor function, returning its address. You can use map files (version scripts) to set the visibility for all functions in your library in one spot, which may be easier than annotating the symbols in the source files.
if it doesn't matter if the main executable and your library see the same address for the symbol (e.g. if it's a constant), you can provide a hidden alias for the symbol and use that for internal references
you can use -Bsymbolic to have the shared library always use its own copy of the symbol, even if it is subject to copy relocation. Be aware that this effectively disables the ability to share variables between shared library and main executable. You'll also not be able to compare function pointers for equality correctly. This option should not be used in production for this reason.
If you cannot use an accessor function for some reason, you could detour the exported symbol through a pointer, only allowing copy-relocations on the pointer:
.bss
.globl local_g
.hidden local_g
local_g:
.space 8
.data
.globl g
g: .quad local_g
In the main binary, declare g as holding a pointer to the variable and dereference it to access the variable. Consider declaring it const so it cannot be overwritten by accident. Note that this approach performs worse than going through the GOT for accesses from other shared objects to the symbol.
You can use -mno-direct-extern-access during compilation of all program parts and linking (shared library and main executable) to avoid copy relocations (you might also need to link all parts with -Wl,-z,nocopyreloc). Note that shared libraries compiled this way are ABI-incompatible with main programs that were compiled without this option and must not be linked to them. The other way round is ok.
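For the example from the question, the hidden-alias workaround from the list above could be sketched like this. The name g_internal is made up for illustration, and the .type/.size directives are again additions that are not in the original source.

```asm
        .text
        .globl  f
        .type   f, @function
f:      leaq    g_internal(%rip), %rax  # PC-relative reference to the hidden alias
        ret
        .size   f, .-f

        .data
        .globl  g
        .protected g
        .type   g, @object
        .size   g, 8
g:      .quad   8

        # g_internal (hypothetical name) is a second, hidden symbol
        # for the same storage; it is not exported from the shared object.
        .globl  g_internal
        .hidden g_internal
        .type   g_internal, @object
        .size   g_internal, 8
        .set    g_internal, g
```

The caveat from the list still applies: if the main executable references g directly and thereby triggers a copy relocation, f keeps returning the address inside the library, not the address of the copy the executable sees.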
The best option however is to just go through the GOT as with any other global symbol.