unix  shared-libraries  dynamic-linking  position-independent-code  got

How can two processes share the same Shared Library?


I've been trying to get a better grasp of how shared libraries work, but I just can't wrap my head around two things.

1- Each process has its own virtual memory space and page table, so if a shared library gets loaded into one process's virtual memory space, how can a second process access that shared library, since it's not in its memory space?

2- I understand that only the text section is shared while global data is not. How is this possible? My understanding is that each reference to a global variable goes through the Global Offset Table (GOT for short). So if I have the line of code x = glob, it will compile to something roughly like mov eax,DWORD PTR [ecx-0x10] in assembly, where ecx holds the base address of the GOT. But if this is the case, then no matter which process executes that line, it will always access the same global variable, the one whose address is stored at offset 0x10 in the GOT. So how can two processes have different copies of a global variable if they use the same text section, which references the same GOT entry?


Solution

  • Presumably you understand page tables and copy-on-write semantics.

    Suppose you run an executable a.out, which initializes some global data and then fork()s. You should have little trouble understanding that all read-only (e.g. code) pages of a.out are now shared between the two processes (the exact same pages of physical memory are mapped into both virtual memory spaces).

    Now suppose that a.out also used libc.so.6 before forking. You should have no trouble understanding that the read-only pages belonging to libc.so.6 are also shared between processes in exactly the same fashion.

    Now suppose that you have two separate executables, a.out and b.out, both using libc.so.6. Suppose a.out runs first. The dynamic loader will perform a read-only mapping of libc.so.6 into a.out's virtual memory space, and now some of its pages are in physical memory. At that point, b.out starts, and the dynamic loader mmaps the same libc.so.6 pages into its virtual memory. Since the kernel already holds these file pages in physical memory, there is no reason for it to create new physical pages to back the second mapping -- it can re-use the previously mapped physical pages. The end result is the same as for the forked binary -- the same physical pages are shared between multiple virtual memory spaces (and multiple processes).

    "So how can two processes have different copies of global variable?"

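    Very simple: the read-write mappings (which are required for writable data, and which also hold the GOT itself) are private to each process, not shared. Both processes execute the same code and use the same offset into the GOT, but each one reaches its own copy of the GOT and of the data pages. So one process can write to the variable, and that write will not be visible to the other process.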
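
The two behaviours described above can be observed with a short C sketch. This is only an illustration under some assumptions, not what the dynamic loader literally runs: the variable glob stands in for a library's writable global, a scratch file under /tmp stands in for the library file, and the function names are made up for this example. The first function forks and lets the child overwrite glob; the parent still sees the old value, because writable pages become private copies on the first write. The second function maps a file with MAP_PRIVATE and write permission (the kind of mapping typically used for a library's data segment), modifies the mapping, and checks that the file itself is unchanged.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a writable global in a library's data segment. */
static int glob = 1;

/*
 * Fork, let the child overwrite glob, and return the value the parent
 * still sees afterwards. Returns -1 on error.
 */
int parent_view_after_child_write(void)
{
    glob = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        glob = 42;                  /* first write gives the child its own page */
        _exit(glob == 42 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return glob;                    /* the parent's copy was never touched */
}

/*
 * Map a scratch file with MAP_PRIVATE and write permission, modify the
 * mapping, and report whether the file still holds the original byte.
 * Returns 1 if the file is unchanged, 0 if it changed, -1 on error.
 */
int private_mapping_leaves_file_untouched(void)
{
    char path[] = "/tmp/privmapXXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* the file lives on until fd is closed */
    if (write(fd, "A", 1) != 1) {
        close(fd);
        return -1;
    }
    char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    assert(p[0] == 'A');            /* the mapping starts out showing the file */
    p[0] = 'B';                     /* lands in a private copy of the page */
    char on_disk = 0;
    ssize_t n = pread(fd, &on_disk, 1, 0);
    munmap(p, 1);
    close(fd);
    if (n != 1)
        return -1;
    return on_disk == 'A';
}
```

Called from a main(), parent_view_after_child_write() should return 1 (the parent's original value) and private_mapping_leaves_file_untouched() should also return 1. Pages that are never written, such as code pages, never get a private copy, which is why they can stay shared between processes.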