The kernel document https://www.kernel.org/doc/gorman/html/understand/understand010.html says this about vmalloc-ing:
"It searches through a linear linked list of vm_structs and returns a new struct describing the allocated region."
Does that mean the vm_struct list is already created while booting up, just like with kmem_cache_create, and vmalloc() just adjusts the page entries? If that is the case, say I have 16GB of RAM in an x86_64 machine: is the whole ZONE_NORMAL, i.e. 16GB - ZONE_DMA - ZONE_DMA32 - slab-memory(cache/kmalloc), used to create the vm_struct list?
That document is fairly old. It's talking about Linux 2.5-2.6, and from what I can tell those functions have changed quite a bit since then. I'll start with code from kernel 2.6.12, since that matches Gorman's explanation and is the oldest non-rc tag in the Linux kernel GitHub repo.
The vm_struct list that the document is referring to is called vmlist. It is created here as a struct pointer:
struct vm_struct *vmlist;
Trying to figure out if it is initialized with any structs during bootup took some deduction. The easiest way to figure it out was by looking at the function get_vmalloc_info() (edited for brevity):
if (!vmlist) {
    vmi->largest_chunk = VMALLOC_TOTAL;
}
else {
    vmi->largest_chunk = 0;
    prev_end = VMALLOC_START;
    for (vma = vmlist; vma; vma = vma->next) {
        unsigned long addr = (unsigned long) vma->addr;
        if (addr >= VMALLOC_END)
            break;
        vmi->used += vma->size;
        free_area_size = addr - prev_end;
        if (vmi->largest_chunk < free_area_size)
            vmi->largest_chunk = free_area_size;
        prev_end = vma->size + addr;
    }
    if (VMALLOC_END - prev_end > vmi->largest_chunk)
        vmi->largest_chunk = VMALLOC_END - prev_end;
}
The logic says that if the vmlist pointer is NULL (so !vmlist is true), then there are no vm_structs on the list and the largest_chunk of free memory in the VMALLOC area is the entire space, hence VMALLOC_TOTAL. However, if there is something on the vmlist, the code works out the largest chunk from the gaps between entries: the difference between the address of the current vm_struct and the end of the previous vm_struct (i.e. free_area_size = addr - prev_end), plus the gap between the last entry and VMALLOC_END.
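To make the gap-walking logic easier to follow, here is a minimal userspace sketch of the same idea. It is not the kernel code: struct region and largest_free_chunk() are hypothetical names, and the sketch assumes the list is sorted by address, non-overlapping, and entirely inside [start, end).

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's vm_struct. */
struct region {
    unsigned long addr;   /* start address of the region */
    unsigned long size;   /* length of the region in bytes */
    struct region *next;  /* next region, in ascending address order */
};

/*
 * Return the size of the largest gap in [start, end) that is not
 * covered by any region on the list.
 * Assumption: the list is sorted, non-overlapping and within range.
 */
unsigned long largest_free_chunk(const struct region *list,
                                 unsigned long start, unsigned long end)
{
    unsigned long largest = 0;
    unsigned long prev_end = start;
    const struct region *r;

    /* With an empty list the loop never runs and the tail check
     * below reports the whole range as free. */
    for (r = list; r != NULL; r = r->next) {
        if (r->addr >= end)
            break;
        /* Gap between the previous region's end and this region. */
        if (r->addr - prev_end > largest)
            largest = r->addr - prev_end;
        prev_end = r->addr + r->size;
    }

    /* Gap between the last region and the end of the range. */
    if (end - prev_end > largest)
        largest = end - prev_end;

    return largest;
}
```

With two one-page regions at 0x2000 and 0x5000 inside the range 0x1000 to 0x9000, the gaps are 0x1000, 0x2000 and 0x3000 bytes, so the function reports the trailing gap.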
What this tells us is that when we vmalloc, the kernel looks through the vmlist for a gap between vm_structs, i.e. a stretch of virtual address space big enough to accommodate our request. Only then can it place the new vm_struct there, which then becomes part of the vmlist.
vmalloc will eventually call __get_vm_area(), which is where the action happens:
for (p = &vmlist; (tmp = *p) != NULL; p = &tmp->next) {
    if ((unsigned long)tmp->addr < addr) {
        if ((unsigned long)tmp->addr + tmp->size >= addr)
            addr = ALIGN(tmp->size +
                         (unsigned long)tmp->addr, align);
        continue;
    }
    if ((size + addr) < addr)
        goto out;
    if (size + addr <= (unsigned long)tmp->addr)
        goto found;
    addr = ALIGN(tmp->size + (unsigned long)tmp->addr, align);
    if (addr > end - size)
        goto out;
}

found:
    area->next = *p;
    *p = area;
By this point in the function we have already created a new vm_struct named area. This for loop just needs to find where to put the struct in the list. If the vmlist is empty, we skip the loop and immediately execute the "found" lines, making *p (the vmlist) point to our struct. Otherwise, we walk the list until we reach the struct that will come after ours, and link area in just before it.
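The pointer-to-pointer walk is the part that tends to confuse people, so here is a simplified userspace sketch of a first-fit insert into a sorted list. It is an illustration only, not the kernel function: struct region and insert_first_fit() are hypothetical names, alignment and overflow handling are left out, and the list is assumed to be sorted, non-overlapping, and inside [start, end).

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's vm_struct. */
struct region {
    unsigned long addr;
    unsigned long size;
    struct region *next;
};

/*
 * Place `area` (with area->size already set) into the first gap of
 * [start, end) that can hold it, keeping the list sorted by address.
 * Returns 0 on success, -1 if no gap is large enough.
 */
int insert_first_fit(struct region **head, struct region *area,
                     unsigned long start, unsigned long end)
{
    unsigned long addr = start;        /* candidate start address */
    unsigned long size = area->size;
    struct region **p, *tmp;

    if (size == 0 || size > end - start)
        return -1;

    /* p always points at the link that would have to be rewritten
     * to insert before tmp: first the list head, then a ->next. */
    for (p = head; (tmp = *p) != NULL; p = &tmp->next) {
        if (tmp->addr - addr >= size)
            break;                     /* gap before tmp is big enough */
        addr = tmp->addr + tmp->size;  /* otherwise try right after tmp */
        if (addr > end - size)
            return -1;                 /* ran out of address space */
    }

    area->addr = addr;
    area->next = *p;                   /* same two lines as "found:" */
    *p = area;
    return 0;
}
```

On an empty list the loop body never executes, so *p is still the head pointer and the new region becomes the first entry, which mirrors the empty-vmlist case described above.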
So in summary, this means that even though the vmlist pointer might be created at boot time, the list isn't necessarily populated at boot time. That is, unless there are vmalloc calls during boot or functions that explicitly add vm_structs to the list during boot as in future kernel versions (see below for kernel 6.0.9).
One further clarification for you. You asked if ZONE_NORMAL is used for the vmlist, but those belong to two separate address spaces. ZONE_NORMAL describes physical memory, whereas the vm in vm_struct refers to virtual memory. There are lots of resources explaining the difference between the two (e.g. this Stack Overflow question). The virtual address range covered by the vmlist goes from VMALLOC_START to VMALLOC_END. On x86_64, those were defined as:
#define VMALLOC_START 0xffffc20000000000UL
#define VMALLOC_END 0xffffe1ffffffffffUL
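To put those two constants in perspective against the 16GB of RAM in the question, the size of the range can be worked out directly. This is just arithmetic on the values quoted above; the macro and function names are hypothetical, and treating VMALLOC_END as the last usable byte (hence the + 1) is my assumption.

```c
#include <stdint.h>

/* Values copied from the 2.6.12 definitions quoted above;
 * the EXAMPLE_ prefix marks these as names made up for this sketch. */
#define EXAMPLE_VMALLOC_START 0xffffc20000000000ULL
#define EXAMPLE_VMALLOC_END   0xffffe1ffffffffffULL

/* Size of the vmalloc virtual address range in bytes.
 * Assumption: END is an inclusive last address, so add 1. */
uint64_t vmalloc_span_bytes(void)
{
    return (uint64_t)EXAMPLE_VMALLOC_END
         - (uint64_t)EXAMPLE_VMALLOC_START + 1;
}

/* The same size expressed in TiB (2^40 bytes). */
uint64_t vmalloc_span_tib(void)
{
    return vmalloc_span_bytes() >> 40;
}
```

The range works out to 2^45 bytes, or 32 TiB of virtual address space, which is far more than the physical RAM in the machine. That is only possible because the range is virtual: physical pages are mapped into it on demand, and none of it is carved out of ZONE_NORMAL in advance.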
For kernel version 6.0.9:
The vm_struct list is declared here:
static struct vm_struct *vmlist __initdata;
At this point, there is nothing on the list. But in this kernel version there are a few boot functions that may add structs to the list:
void __init vm_area_add_early(struct vm_struct *vm)
void __init vm_area_register_early(struct vm_struct *vm, size_t align)
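As a rough illustration of what adding a struct to the list early in boot involves, here is a userspace sketch of an address-ordered insert where the caller has already chosen the address. It is a simplified model, not the kernel implementation: struct region and add_early() are hypothetical names, and returning -1 on overlap is a choice made for this sketch (as far as I can tell, the kernel treats an overlap here as a fatal bug instead).

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's vm_struct. */
struct region {
    unsigned long addr;
    unsigned long size;
    struct region *next;
};

/*
 * Insert `vm`, whose addr and size were chosen by the caller, into
 * the list so that it stays sorted by address.
 * Returns 0 on success, -1 if `vm` overlaps an existing region.
 */
int add_early(struct region **head, struct region *vm)
{
    struct region **p, *tmp;

    for (p = head; (tmp = *p) != NULL; p = &tmp->next) {
        if (tmp->addr >= vm->addr) {
            /* tmp will follow vm; vm must end before tmp starts. */
            if (tmp->addr < vm->addr + vm->size)
                return -1;
            break;
        }
        /* tmp precedes vm; tmp must end before vm starts. */
        if (tmp->addr + tmp->size > vm->addr)
            return -1;
    }

    vm->next = *p;
    *p = vm;
    return 0;
}
```

Unlike a vmalloc-style search, nothing here looks for a free gap. The caller dictates the address and the function only keeps the list ordered.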
As for vmalloc in this version, the vmlist is now only a list used during initialization. get_vm_area() now calls __get_vm_area_node(), which is a NUMA-aware function. From there, the logic goes deeper and is much more complicated than the linear search described above.