I am trying to better understand PLT and GOT.

- `.plt` and `.got.plt` sections present in the process memory
- `.got.plt` sections of shared libraries. I will explain more on this point later.

PLT (Procedure Linkage Table) is introduced for dynamic linking. A hooking method that exploits how the linker relies on this mechanism can be called a `PLT hook`. In particular, one can change the results of symbol resolution or binding for functions by modifying data in the `.got.plt` section.
Here is a much simplified model:

Suppose `E` is your executable and `L` is a shared library to be linked for `E`. When `L` is loaded and the linker finishes the whole linking process for `L`, the `.got.plt` section (it could be another section with a different name that serves the same function) is filled with values read from the `.dynsym` sections of `L` and its dependencies. Note that `.dynsym` consists of the symbols exported from `L` and the symbols imported into `L`.

When all shared libraries of `E` are loaded, the linker starts to fill the `.got.plt` section of `E`, which enables `E` to find all of its imported symbols (there are no exported ones).
The above process of filling `.got.plt` is called relocation, and it is driven by the `.rel.*` and `.dynsym` sections. Each entry of a relocation section gives the address in `.got.plt` to be filled with a new value (i.e., the address of a function in memory), and the new value is computed from the offset given for that symbol in the `.dynsym` section of the library that defines it, added to that library's load address.
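The relocation step can be pictured with a small toy model. This is only a sketch in Python, not real ELF parsing: the load addresses, slot offsets, and the second symbol `g` are made-up values chosen for illustration, and the helper names `resolve` and `relocate` are hypothetical.

```python
# Toy model of relocation; every number and name below is hypothetical.

L_BASE = 0x7000_0000                      # assumed load address of L
L_DYNSYM = {"f": 0x1200, "g": 0x1340}     # exported symbol -> offset inside L

E_BASE = 0x4000_0000                      # assumed load address of E
# Relocation entries of E: (offset of the .got.plt slot inside E, symbol name)
E_REL_PLT = [(0x3018, "f"), (0x3020, "g")]


def resolve(symbol):
    """Run-time address = load address of the defining library + .dynsym offset."""
    return L_BASE + L_DYNSYM[symbol]


def relocate(base, rel_entries):
    """Return the filled slots as a map: slot address -> function address."""
    slots = {}
    for slot_offset, symbol in rel_entries:
        slots[base + slot_offset] = resolve(symbol)
    return slots


e_got_plt = relocate(E_BASE, E_REL_PLT)
for slot, target in e_got_plt.items():
    print(hex(slot), "->", hex(target))
```

Each relocation entry only says where to write and for which symbol; the value written comes from the symbol table of the library that defines the symbol.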
Now suppose `E` calls an imported function `f` from `L`. Then `f` appears in the `.got.plt` sections of both `E` and `L`, and both entries are filled with the same value, i.e., the memory address of `f`. This reveals how the linker reacts when an address is queried: it uses information from the ELF file to find a position, i.e., the address of a `.got.plt` entry in memory, and the address of the queried function is stored at that position. That is to say, the value returned by the linker, for example the result of a `dlsym` call, is an integer that can be cast to a function pointer.
Recall that the assembly code that calls `f` does the following:

1. jump to the entry for `f` in the `.plt` section of `E`,
2. go to the `.got.plt` part of `E` and read the value there, which is the address of `f` in memory,
3. jump to `f` and execute it.

To do a PLT hook, one can choose to:

1. modify the `.got.plt` table of `E`, which will hook calls of `f` from `E`.
2. modify the `.got.plt` table of `L`, so that `dlsym` and newly loaded libraries will execute the hooked function.
3. modify the `.dynsym` section of `L`, which can achieve a similar effect to the second method, and thus will be considered the same method as 2.

Both methods have their limits. The first one hooks functions imported from libraries that are already loaded, but won't change the result of `dlsym`. The second one cannot hook functions that are called directly using their absolute address.
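The call path and the limits of both methods can be played through in a toy model. The sketch below is written in Python and follows the simplified model of this post, not a real loader: Python callables stand in for addresses, and `Module`, `toy_dlsym`, `real_f` and `hooked_f` are hypothetical names introduced only for illustration.

```python
def real_f():
    return "real f"


def hooked_f():
    return "hooked f"


class Module:
    """Toy stand-in for a loaded ELF; only its .got.plt is modelled."""

    def __init__(self):
        self.got_plt = {}  # symbol name -> "address" (a Python callable here)

    def call(self, symbol):
        # PLT stub: read the .got.plt slot, then jump to whatever it holds.
        target = self.got_plt[symbol]
        return target()


def toy_dlsym(module, symbol):
    # Assumption of the simplified model: a lookup returns the value on L's side.
    return module.got_plt[symbol]


L = Module()
E = Module()
L.got_plt["f"] = real_f            # filled when L is linked
E.got_plt["f"] = L.got_plt["f"]    # filled afterwards with the same value

direct = toy_dlsym(L, "f")         # a caller that keeps the absolute address of f

E.got_plt["f"] = hooked_f          # method 1: patch the slot in E
after_method_1 = (E.call("f"), toy_dlsym(L, "f")())

L.got_plt["f"] = hooked_f          # method 2: patch the slot in L
after_method_2 = (toy_dlsym(L, "f")(), direct())

print(after_method_1)  # ('hooked f', 'real f')
print(after_method_2)  # ('hooked f', 'real f')
```

After method 1, calls from `E` reach the hook while the lookup still returns the original function. After method 2, the lookup returns the hook, but the caller that saved the absolute address earlier still reaches the original `f`.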
To simplify further, so that it is easier to remember: `.got.plt` appears in every ELF of a process that uses dynamic linking. For a given function `f` in `L`, a pointer is first defined in `L` pointing to the memory address of `f` (the `.got.plt` entry for `f` in `L` is filled), then the value stored in it is used to create a pointer for `E` (the `.got.plt` entry for `f` in `E` is filled). `PLT hook`s can change the value stored in the pointers in `E` or `L`. If the changed pointer is in `E`, then only calls of `f` from `E` will be hooked. If the changed pointer is in `L`, then `dlsym` will reference the hooked function and all libraries loaded later will be affected (since the pointers to `f` in them are created using the value we have changed).
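The pointer picture above can be sketched as follows. This is again a toy model in Python under the assumptions of this post: binding is modelled as copying the value currently stored on `L`'s side, and the later-loaded library `M` and the helper `bind` are hypothetical.

```python
def real_f():
    return "real f"


def hooked_f():
    return "hooked f"


# (module, symbol) -> stored "address"; a Python callable stands in for an address.
slots = {}


def bind(module, symbol, provider):
    """Create the module's pointer from the value currently stored by the provider."""
    slots[(module, symbol)] = slots[(provider, symbol)]


slots[("L", "f")] = real_f     # the pointer in L is defined first
bind("E", "f", "L")            # E is bound before any hook is installed

slots[("L", "f")] = hooked_f   # hook installed by changing the pointer in L

bind("M", "f", "L")            # hypothetical library M, loaded after the hook

print(slots[("E", "f")]())     # real f
print(slots[("M", "f")]())     # hooked f
print(slots[("L", "f")]())     # hooked f
```

In this model, `E` was bound before the change and keeps the original address, while `M` copies the changed value and therefore reaches the hook.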
In most use cases, the first method is applied to a shared library. That is to say, we hook functions that are imported into the target library. Both methods are called `PLT hook` and change the so-called GOT (Global Offset Table). Since in the second one the function address may be stored in other sections, I personally prefer the name `PLT hook`.