
Which direction does the memory-mapped segment of a process's virtual address space grow by default?


I'm currently going through the code that loads an ELF from disk into memory, which corresponds to the function load_elf_binary() in the Linux kernel.

This function sets up the addresses of the different segments (e.g. text, data, bss, heap, stack, mmap'ed area). While tracing the code, I noticed the function setup_new_exec(), defined in /fs/exec.c, which calls arch_pick_mmap_layout(). Note that I am not targeting a specific architecture like x86, so I am referring to the generic function definition.

Below is part of the code:

if (mmap_is_legacy(rlim_stack)) {
    mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
    mm->get_unmapped_area = arch_get_unmapped_area;
} else {
    mm->mmap_base = mmap_base(random_factor, rlim_stack);
    mm->get_unmapped_area = arch_get_unmapped_area_topdown;
}

Based on the code, I know there are two ways of obtaining unmapped areas: bottom-up (legacy) and top-down. Both are discussed in this LWN article as well.

To distinguish between them, the kernel uses mmap_is_legacy(), which returns sysctl_legacy_va_layout. sysctl_legacy_va_layout is initialized to 0 by default.

Does that mean that, by default, the memory-mapped region of a process grows from top to bottom (from high addresses to low addresses, i.e. from the stack towards the heap)?


Solution

  • Your general assumption that "by default, the memory mapped region of a process grows from top to bottom" is correct.

    The default and legacy layouts nowadays should look like this:

    DEFAULT               LEGACY
    0xffffffffffffffff    0xffffffffffffffff   
        stack                 stack            
         🡓                     🡓              
        mmap                  ...           
         🡓                     🡑             
        ...                   heap             
        ...                   ELF
         🡑                    ...                
        heap                   🡑             
        ELF                   mmap
        ...                   ...
    0x0000000000000000    0x0000000000000000
    

    [...] the legacy layout nowadays has the mmap segment being the low address. Is there code that proves that? Besides, does the legacy memory layout nowadays start from virtual address 0?

    Sure, you can see this in the code you linked: the generic arch_pick_mmap_layout() implementation chooses a low mmap_base for the legacy layout, computed as TASK_UNMAPPED_BASE + random_factor (the random_factor comes from ASLR, see /proc/sys/kernel/randomize_va_space). Note that some architectures (namely x86, PA-RISC, PowerPC, S390, Sparc) override that function and provide their own, but the calculations they perform are pretty much the same (you can check the source code).
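The choice between the two layouts can be modeled in user space. The following is a simplified sketch, not kernel code: the constants are assumed values for x86-64 with 4-level paging and 4 KiB pages, the stack guard gap and stack randomization padding that the kernel also accounts for are left out, and the names `mmap_layout_model` and `pick_layout_model` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4-level paging; other setups differ. */
#define MODEL_PAGE_SIZE          4096ULL
#define MODEL_STACK_TOP          0x7ffffffff000ULL
#define MODEL_TASK_UNMAPPED_BASE 0x2aaaaaaab000ULL   /* page-aligned STACK_TOP / 3 */
#define MODEL_MIN_GAP            (128ULL * 1024 * 1024)
#define MODEL_MAX_GAP            (MODEL_STACK_TOP / 6 * 5)

/* Hypothetical stand-in for the two mm fields set by the kernel code. */
struct mmap_layout_model {
    uint64_t mmap_base;
    bool top_down;
};

/* Round an address up to the next page boundary. */
static uint64_t model_page_align(uint64_t addr)
{
    return (addr + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

static struct mmap_layout_model pick_layout_model(bool legacy,
                                                  uint64_t stack_rlimit,
                                                  uint64_t random_factor)
{
    struct mmap_layout_model layout;

    if (legacy) {
        /* Legacy: start low and search upwards. */
        layout.mmap_base = MODEL_TASK_UNMAPPED_BASE + random_factor;
        layout.top_down = false;
    } else {
        /* Default: leave a clamped gap for the stack and search downwards. */
        uint64_t gap = stack_rlimit;

        if (gap < MODEL_MIN_GAP)
            gap = MODEL_MIN_GAP;
        else if (gap > MODEL_MAX_GAP)
            gap = MODEL_MAX_GAP;

        layout.mmap_base = model_page_align(MODEL_STACK_TOP - gap - random_factor);
        layout.top_down = true;
    }
    return layout;
}
```

With these assumptions, an 8 MiB stack limit and no randomization give a top-down base of 0x7ffff7fff000, while the legacy base is 0x2aaaaaaab000.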

    That TASK_UNMAPPED_BASE represents the lower boundary for the "mmap" virtual memory area, and it varies per architecture. It should not be defined as low as 0 (zero) though. It is usually defined in terms of TASK_SIZE.

    Some examples:
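As one worked illustration, x86 defines the value as one third of the task size, rounded up to a page boundary. The sketch below reproduces that arithmetic in user space, assuming 4 KiB pages; the function names are hypothetical, the task sizes passed in are assumptions about the configuration, and other architectures use different fractions.

```c
#include <stdint.h>

#define ASSUMED_PAGE_SIZE 4096ULL

/* Round an address up to the next page boundary. */
static uint64_t page_align_up(uint64_t addr)
{
    return (addr + ASSUMED_PAGE_SIZE - 1) & ~(ASSUMED_PAGE_SIZE - 1);
}

/* x86-style rule: lower boundary of the mmap area is task_size / 3. */
static uint64_t task_unmapped_base_x86(uint64_t task_size)
{
    return page_align_up(task_size / 3);
}
```

For a 47-bit user address space (task size 0x7ffffffff000) this yields 0x2aaaaaaab000, and for a 3 GiB 32-bit user address space (task size 0xC0000000) it yields 0x40000000.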

    The lowest address mappable by userspace is actually given by /proc/sys/vm/mmap_min_addr and is non-zero by default (for example 0x10000 on x86). Such low addresses must be explicitly requested through a hint to mmap; the kernel does not pick them on its own as a result of mmap(0, ...).
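This can be checked from a running process. The sketch below assumes a Linux system where /proc/sys/vm/mmap_min_addr is readable; the helper names are hypothetical. It reads the limit and maps one anonymous page without a hint, so that the address picked by the kernel can be compared against it.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Returns the value of vm.mmap_min_addr, or -1 if it cannot be read. */
static long long read_mmap_min_addr(void)
{
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    long long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%lld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Maps one anonymous page with no address hint.
 * Returns the address chosen by the kernel, or 0 on failure. */
static uintptr_t map_without_hint(void)
{
    void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return page == MAP_FAILED ? 0 : (uintptr_t)page;
}
```

On a typical system the address returned by `map_without_hint()` is far above the value returned by `read_mmap_min_addr()`, since the kernel starts its search at mmap_base rather than at the minimum address.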

    So we have established that in the legacy layout the "mmap" area starts at some low address, and we already know that the stack always starts at the highest address.

    As for the ELF itself, the file is simply mapped into memory by the kernel according to its type and its program headers, usually contiguously with no holes, and the calculations are the same regardless of the default/legacy mmap layout. You will see multiple segments mapped with different permissions as specified in the ELF program headers (see the output of readelf -l), and those segments will contain different sections, such as .text, .rodata, .bss, and so on (see the output of readelf -S).

    For ELF Executables (i.e. e_type = ET_EXEC, see man 5 elf) the base virtual address is chosen by the ELF itself: it is fixed and determined at link time, and such an ELF must be loaded at exactly that address in order to work.
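The e_type field that distinguishes the two kinds of ELF can be read directly from the file header. The following is a minimal sketch assuming a Linux system with `<elf.h>` available; the function name `elf_type_of` is hypothetical. It relies on e_type being the 16-bit field immediately after the e_ident bytes in both 32-bit and 64-bit ELF headers.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns the e_type of the ELF file at `path` (ET_EXEC, ET_DYN, ...),
 * or -1 if the file cannot be read or is not an ELF. */
static int elf_type_of(const char *path)
{
    unsigned char header[EI_NIDENT + 2];
    FILE *f = fopen(path, "rb");
    size_t got;

    if (f == NULL)
        return -1;
    got = fread(header, 1, sizeof header, f);
    fclose(f);

    if (got != sizeof header || memcmp(header, ELFMAG, SELFMAG) != 0)
        return -1;

    /* Decode the 16-bit e_type in the byte order the file declares. */
    if (header[EI_DATA] == ELFDATA2MSB)
        return (header[EI_NIDENT] << 8) | header[EI_NIDENT + 1];
    return header[EI_NIDENT] | (header[EI_NIDENT + 1] << 8);
}
```

Calling `elf_type_of("/proc/self/exe")` reports the type of the running program; on distributions that build position-independent executables by default this is usually ET_DYN.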

    For ELF Shared Objects (i.e. e_type = ET_DYN), which nowadays are the norm, the base virtual address is chosen by the kernel itself and is defined by ELF_ET_DYN_BASE (adjusted if ASLR is enabled). This other answer of mine covers x86. This value is above TASK_UNMAPPED_BASE, so you will see the ELF above the "mmap" area (higher addresses) in the legacy layout, and below it (lower addresses) in the default layout.
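The relative position of the two bases can be illustrated with a little arithmetic. This sketch assumes the 64-bit x86 definitions (ELF_ET_DYN_BASE as two thirds of the task size, TASK_UNMAPPED_BASE as one third rounded up to a page), a 47-bit user address space and 4 KiB pages, with ASLR ignored; the function names are hypothetical.

```c
#include <stdint.h>

#define ASSUMED_PAGE_SIZE 4096ULL
#define ASSUMED_TASK_SIZE 0x7ffffffff000ULL   /* 47-bit user address space */

/* Lower boundary of the "mmap" area: task size / 3, rounded up to a page. */
static uint64_t assumed_task_unmapped_base(void)
{
    uint64_t third = ASSUMED_TASK_SIZE / 3;

    return (third + ASSUMED_PAGE_SIZE - 1) & ~(ASSUMED_PAGE_SIZE - 1);
}

/* Load address of an ET_DYN ELF without ASLR: task size / 3 * 2,
 * rounded down to the start of its page. */
static uint64_t assumed_et_dyn_load_address(void)
{
    uint64_t base = ASSUMED_TASK_SIZE / 3 * 2;

    return base & ~(ASSUMED_PAGE_SIZE - 1);
}
```

Under these assumptions the ELF lands at 0x555555554000, well above the legacy mmap base of 0x2aaaaaaab000 and well below the stack.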

    The "heap" area (a.k.a. the program break) will, by definition, start after the ELF (after a random gap when ASLR is enabled) and grow towards high addresses regardless of the layout.
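The position and growth direction of the program break can be observed with sbrk(). This is a minimal sketch assuming a Linux process started normally (not through an explicit invocation of the dynamic loader); the function and variable names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <unistd.h>

/* A zero-initialized static object, placed in the ELF's .bss. */
static int marker_in_elf_image;

/* Returns 1 if the current program break lies above the ELF image. */
static int break_is_above_elf(void)
{
    uintptr_t current_break = (uintptr_t)sbrk(0);

    return current_break > (uintptr_t)&marker_in_elf_image;
}

/* Returns 1 if extending the heap moves the break to a higher address,
 * 0 if it does not, and -1 if sbrk() fails. */
static int break_grows_upwards(void)
{
    uintptr_t before = (uintptr_t)sbrk(0);
    uintptr_t after;

    if (sbrk(4096) == (void *)-1)
        return -1;
    after = (uintptr_t)sbrk(0);
    return after > before;
}
```

Both functions are expected to return 1 in either layout, since only the placement of the "mmap" area changes between default and legacy.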


    Here are a couple of annotated screenshots showing what the default and legacy layouts look like when inspecting /proc/[pid]/maps on my x86-64 machine. Note that low addresses are at the top.

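    Default:

    [Screenshot: annotated /proc/[pid]/maps output, default layout]

    Legacy:

    [Screenshot: annotated /proc/[pid]/maps output, legacy layout]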
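A process can also look up its own mappings to see where each area ended up. The sketch below is a minimal example assuming a Linux system with /proc mounted; the function name `mapping_start` is hypothetical. It scans /proc/self/maps for a mapping with a given name such as [stack] or [heap].

```c
#include <stdio.h>
#include <string.h>

/* Returns the start address of the first mapping in /proc/self/maps whose
 * pathname field equals `name` (e.g. "[stack]"), or 0 if none is found. */
static unsigned long mapping_start(const char *name)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[1024];
    unsigned long result = 0;

    if (maps == NULL)
        return 0;

    while (result == 0 && fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char path[256] = "";
        /* Line format: start-end perms offset dev inode pathname */
        int fields = sscanf(line, "%lx-%lx %*s %*s %*s %*s %255s",
                            &start, &end, path);

        if (fields == 3 && strcmp(path, name) == 0)
            result = start;
    }
    fclose(maps);
    return result;
}
```

Comparing `mapping_start("[stack]")` with the address returned by an mmap() call shows which layout is in effect. Note that the [heap] entry may be missing if the program break has never been extended.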