I'm currently going through the code that loads an ELF from disk to memory, which corresponds to the function load_elf_binary() in the Linux kernel.
This function sets up the addresses of the different segments (e.g. text, data, bss, heap, stack, mmap'ed area). While tracing the code, I noticed the function setup_new_exec(), which is defined here in /fs/exec.c. Inside, it calls arch_pick_mmap_layout(), which is defined here. Note that I am not targeting a specific architecture like x86, so I am referring to the generic function definition.
Below is part of the code:
    if (mmap_is_legacy(rlim_stack)) {
        mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
        mm->get_unmapped_area = arch_get_unmapped_area;
    } else {
        mm->mmap_base = mmap_base(random_factor, rlim_stack);
        mm->get_unmapped_area = arch_get_unmapped_area_topdown;
    }
Based on the code, I know there are two ways of obtaining unmapped areas: bottom-up (legacy) and top-down. Both are discussed in this LWN article as well.
To distinguish between them, we need mmap_is_legacy(), which returns sysctl_legacy_va_layout. sysctl_legacy_va_layout is initialized to 0 by default.
Does that mean by default, the memory mapped region of a process grows from top to bottom (from high address to low address; grows from the stack to the heap)?
Your general assumption that "by default, the memory mapped region of a process grows from top to bottom" is correct.
The default and legacy layouts nowadays should look like this:
         DEFAULT                  LEGACY
    0xffffffffffffffff      0xffffffffffffffff
          stack                   stack
            🡓                       🡓
          mmap                     ...
            🡓                       🡑
           ...                    heap
           ...                     ELF
            🡑                      ...
          heap                      🡑
           ELF                    mmap
           ...                     ...
    0x0000000000000000      0x0000000000000000
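One way to observe the growth direction from userspace is to create two anonymous mappings in a row and compare their addresses. The following is a minimal sketch, assuming Linux; probe_mmap_direction() is a hypothetical helper name, not a kernel or libc function. The result is only indicative, since the kernel is free to place a mapping in any suitable hole.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Map two anonymous regions one after the other and compare their addresses.
 * Returns  1 if the second mapping landed below the first (top-down),
 *          0 if it landed above the first (bottom-up),
 *         -1 if either mmap() call failed.
 * Hypothetical helper for illustration only.
 */
int probe_mmap_direction(size_t len)
{
    void *first = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (first == MAP_FAILED)
        return -1;

    void *second = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (second == MAP_FAILED) {
        munmap(first, len);
        return -1;
    }

    int lower = (uintptr_t)second < (uintptr_t)first;

    /* Release both probes so the address space is left as we found it. */
    munmap(second, len);
    munmap(first, len);
    return lower;
}
```

Calling probe_mmap_direction(1 << 20) in a process using the default layout would typically return 1. With the legacy layout selected (for instance through the vm.legacy_va_layout sysctl), it would be expected to return 0.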
[...] the legacy layout nowadays has the mmap segment being the low address. Is there code that proves that? Besides, does the legacy memory layout nowadays start from virtual address 0?
Sure, you can see this exactly in the code you linked, in the generic arch_pick_mmap_layout() implementation, which chooses a low mmap_base for the legacy layout. The calculation is TASK_UNMAPPED_BASE + random_factor (the random_factor comes from ASLR, see /proc/sys/kernel/randomize_va_space). Note that some architectures (namely x86, PA-RISC, PowerPC, S390, Sparc) override that function and provide their own, but the calculations that are done are pretty much the same (you can check the source code).
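To make the two branches concrete, here is a simplified userspace model of the choice of mmap_base. It is a sketch under stated assumptions, not the kernel code: struct layout_params, model_mmap_base() and the 4 KiB page size are hypothetical, and the clamping and padding that the kernel applies to the stack gap in the top-down case are omitted.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Round up to the next page boundary, in the spirit of the kernel's PAGE_ALIGN(). */
uint64_t model_page_align(uint64_t addr)
{
    return (addr + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/* Hypothetical parameter bundle; the kernel takes these from macros and the mm. */
struct layout_params {
    uint64_t task_unmapped_base; /* arch-specific lower bound of the mmap area */
    uint64_t stack_top;          /* highest userspace stack address */
    uint64_t stack_gap;          /* room reserved below stack_top for the stack */
    uint64_t random_factor;      /* ASLR offset, 0 when randomization is off */
    int legacy;                  /* nonzero selects the bottom-up layout */
};

uint64_t model_mmap_base(const struct layout_params *p)
{
    if (p->legacy) {
        /* Legacy: start low and let mappings grow towards high addresses. */
        return p->task_unmapped_base + p->random_factor;
    }
    /* Default: start just below the stack gap and grow towards low addresses. */
    return model_page_align(p->stack_top - p->stack_gap - p->random_factor);
}
```

With x86-64-like inputs (task_unmapped_base = 0x2aaaaaaab000, stack_top = 0x7ffffffff000, an 8 MiB stack gap, no randomization), the legacy branch yields 0x2aaaaaaab000 and the top-down branch yields 0x7fffff7ff000, which places the legacy base far below the default one.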
That TASK_UNMAPPED_BASE represents the lower boundary for the "mmap" virtual memory area, and it varies per architecture. It should not be defined as low as 0 (zero) though. It is usually defined in terms of TASK_SIZE.
Some examples:
- TASK_SIZE / 3 = 0x2aaaaaaab000 on x86-64
- TASK_SIZE / 3 = 0x40000000 on x86 32bit with default VMSPLIT_3G config
- TASK_SIZE / 4 = 0x400000000000 on ARM64
- CONFIG_PAGE_OFFSET / 3 = 0x40000000 on ARM 32bit with default VMSPLIT_3G config
- TASK_SIZE / 3 = 0x5555555000 on MIPS 64bit with 40 VA bits
- TASK_SIZE / 8 * 3 = 0x30000000 on PowerPC 8xx (32bit)

The lowest possible address mappable by userspace is actually given by /proc/sys/vm/mmap_min_addr, which is non-zero by default (for example 0x10000 on x86). Such low addresses must be explicitly requested through a hint to mmap; they are not mapped voluntarily by the kernel as a result of mmap(0, ...).
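The arithmetic behind some of these values can be reproduced in a few lines. This is a sketch only: unmapped_base() is a hypothetical helper, and the TASK_SIZE inputs used below (0x7ffffffff000 for x86-64, 0xC0000000 for 32-bit x86, 2^48 for ARM64, 0x80000000 for PowerPC 8xx) as well as the 4 KiB page size are assumptions chosen to match the configurations listed above.

```c
#include <stdint.h>

/* Round up to a 4 KiB page boundary (the page size is an assumption). */
uint64_t page_align_4k(uint64_t addr)
{
    return (addr + 0xfffULL) & ~0xfffULL;
}

/*
 * Compute task_size / den * num, rounded up to a page boundary.
 * Hypothetical helper mirroring the shape of the per-arch
 * TASK_UNMAPPED_BASE definitions (e.g. TASK_SIZE / 3, TASK_SIZE / 8 * 3).
 */
uint64_t unmapped_base(uint64_t task_size, uint64_t num, uint64_t den)
{
    return page_align_4k(task_size / den * num);
}
```

For example, unmapped_base(0x7ffffffff000, 1, 3) gives 0x2aaaaaaab000: the division yields 0x2aaaaaaaa555, which is then rounded up to the next page.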
So we established that for the legacy layout the "mmap" area starts at some low address, and we already know that the stack always starts at the highest address.
As for the ELF itself, the file is merely mapped in memory by the kernel according to its type and its program headers, usually contiguously with no holes, and the calculations are the same regardless of default/legacy mmap layout. You will see multiple segments mapped with different permissions as specified in the ELF program headers (see output of readelf -l), and those segments will contain different sections, such as .text, .rodata, .bss, and so on (see output of readelf -S).
For ELF Executables (i.e. e_type = ET_EXEC, see man 5 elf) the base virtual address is chosen by the ELF itself: it is fixed and determined at compile time, and such an ELF will not work if loaded at a different address.
For ELF Shared Objects (i.e. e_type = ET_DYN), which nowadays are the norm, the base virtual address is chosen by the kernel itself and is defined by ELF_ET_DYN_BASE (adjusted if ASLR is enabled). This other answer of mine covers x86. This value is above TASK_UNMAPPED_BASE, so you will see the ELF above the "mmap" area (higher addresses) in the legacy layout, and below it (lower addresses) in the default layout.
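To tell which of the two cases applies to a given file, it is enough to look at e_type in the ELF header. The sketch below reads it by hand from the first bytes of the file; elf_type() and load_base_policy() are hypothetical helpers, and the constants mirror ET_EXEC and ET_DYN from the ELF specification (on Linux, <elf.h> provides them under those names).

```c
#include <stddef.h>
#include <stdint.h>

/* Same values as ET_EXEC and ET_DYN in the ELF specification. */
enum { SKETCH_ET_EXEC = 2, SKETCH_ET_DYN = 3 };

/*
 * Return e_type from the first bytes of an ELF file, or -1 if the buffer
 * is too short or does not start with the ELF magic. e_type is the 16-bit
 * field right after the 16-byte e_ident array in both ELF32 and ELF64.
 */
int elf_type(const unsigned char *buf, size_t len)
{
    if (len < 18)
        return -1;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return -1;

    /* e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian. */
    if (buf[5] == 2)
        return (buf[16] << 8) | buf[17];
    return buf[16] | (buf[17] << 8);
}

/* Summarize, per the explanation above, who picks the load address. */
const char *load_base_policy(int e_type)
{
    switch (e_type) {
    case SKETCH_ET_EXEC:
        return "fixed by the ELF itself";
    case SKETCH_ET_DYN:
        return "chosen by the kernel (ELF_ET_DYN_BASE, adjusted by ASLR)";
    default:
        return "not an executable or shared object";
    }
}
```

Reading the first 18 bytes of a binary and passing them to elf_type() should agree with the Type field printed by readelf -h for the same file.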
The "heap" area (a.k.a. the program break) by definition will start right after the ELF growing towards high addresses regardless.
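The upward growth of the program break can be checked with sbrk(). This is a minimal sketch, assuming a libc where sbrk() with a non-zero increment is supported (glibc does; some other libcs refuse it); break_growth() is a hypothetical helper.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes sbrk() under strict -std= modes on glibc */
#endif
#include <stdint.h>
#include <unistd.h>

/*
 * Grow the program break by `bytes` and report how far it moved.
 * Returns the signed distance between the new and the old break,
 * or 0 if sbrk() failed. Hypothetical helper for illustration only.
 */
intptr_t break_growth(intptr_t bytes)
{
    void *old_brk = sbrk(0);
    if (old_brk == (void *)-1)
        return 0;
    if (sbrk(bytes) == (void *)-1)
        return 0;

    void *new_brk = sbrk(0);
    intptr_t delta = (intptr_t)((uintptr_t)new_brk - (uintptr_t)old_brk);

    sbrk(-bytes); /* best effort: hand the memory back */
    return delta;
}
```

A positive return value from break_growth(4096) means the break moved towards higher addresses, in both the default and the legacy layout.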
Here are a couple of annotated screenshots showing what the default and legacy layouts look like when inspecting /proc/[pid]/maps on my x86-64 machine. Note that low addresses are at the top.
Default: [screenshot]
Legacy: [screenshot]
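The same inspection can be done programmatically by parsing the lines of /proc/[pid]/maps. The following is a sketch, not a complete parser: struct map_entry and parse_maps_line() are hypothetical names, and only the address range, permissions and path are extracted.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the fields of interest from one maps line. */
struct map_entry {
    uint64_t start;  /* first address of the mapping */
    uint64_t end;    /* one past the last address */
    char perms[5];   /* e.g. "r-xp" */
    char path[256];  /* backing file, "[heap]", "[stack]", or empty */
};

/*
 * Parse one line of /proc/[pid]/maps, whose layout is:
 *   start-end perms offset dev inode [path]
 * Returns 1 on success, 0 if the line does not match.
 */
int parse_maps_line(const char *line, struct map_entry *out)
{
    out->path[0] = '\0'; /* anonymous mappings have no path */
    int n = sscanf(line,
                   "%" SCNx64 "-%" SCNx64 " %4s %*s %*s %*s %255[^\n]",
                   &out->start, &out->end, out->perms, out->path);
    return n >= 3;
}
```

Feeding each line of /proc/self/maps (read with fgets()) to parse_maps_line() and comparing the start addresses of the "[heap]" entry, the ELF file entries and the library mappings shows on which side of the ELF the "mmap" area sits.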