
Do kernel threads have their own page table in Linux?


User-space processes have their own page table created by the kernel.
I understand that kernel threads do not get a virtual address space of their own. I have two questions:

  1. Do kernel threads have their own page table in Linux?
  2. Does the kernel maintain a page table exclusively for its own functioning? (Note: I am not asking about a process running in kernel mode.)

Solution

    1. Do kernel threads have their own page table in Linux?

    No, they don't. You can see this in the context_switch() function (kernel/sched/core.c) as of Linux 6.3:

    /*
     * kernel -> kernel   lazy + transfer active
     *   user -> kernel   lazy + mmgrab() active
     *
     * kernel ->   user   switch + mmdrop() active
     *   user ->   user   switch
     */
    if (!next->mm) {                                // to kernel
        enter_lazy_tlb(prev->active_mm, next);
    
        next->active_mm = prev->active_mm;
        if (prev->mm)                           // from user
            mmgrab(prev->active_mm);
        else
            prev->active_mm = NULL;
    }
    

    When context switching, the next task may not have an mm_struct (next->mm is NULL). This is the case for kernel threads, as they do not have their own virtual address space. The mm_struct normally holds the pgd (page global directory), which is the top-level table of that task's page tables. Since kthreads have no mm_struct of their own, they have no kthread-specific page table.

    Instead, as the code above shows, switching from kernel task to kernel task, or from a user task to a kernel thread, involves a lazy TLB switch. Both cases inherit the previously active mm_struct and therefore its page tables. The only difference is that when switching from user to kernel, you must mmgrab() the mm_struct to make sure it isn't freed while the kernel thread is using it; in the kernel-to-kernel case, the reference held by the previous kernel thread is simply transferred to the next one.
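
    To make the bookkeeping concrete, here is a minimal user-space sketch of the same logic. It is not kernel code: toy_mm, toy_task, toy_mmgrab(), toy_mmdrop() and toy_context_switch() are hypothetical stand-ins, the "to user" branch paraphrases the part of context_switch() not quoted above, and the mmdrop happens immediately here for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct: only the pin count is modeled. */
struct toy_mm {
    int mm_count;              /* references keeping the struct and its pgd alive */
};

/* Hypothetical stand-in for task_struct. */
struct toy_task {
    struct toy_mm *mm;         /* NULL for a kernel thread */
    struct toy_mm *active_mm;  /* page tables in use while the task runs */
};

/* Models mmgrab(): pin the mm so it cannot be freed while borrowed. */
static void toy_mmgrab(struct toy_mm *mm)
{
    mm->mm_count++;
}

/* Models mmdrop(): release a pin taken by toy_mmgrab(). */
static void toy_mmdrop(struct toy_mm *mm)
{
    assert(mm->mm_count > 0);
    mm->mm_count--;
}

/*
 * Mirrors the four cases in the context_switch() comment.
 * Returns 1 if page tables were switched, 0 if they were borrowed.
 */
int toy_context_switch(struct toy_task *prev, struct toy_task *next)
{
    if (!next->mm) {                        /* to kernel thread */
        next->active_mm = prev->active_mm;  /* borrow what is already loaded */
        if (prev->mm)                       /* from user: take a new pin */
            toy_mmgrab(prev->active_mm);
        else                                /* from kernel: hand the pin over */
            prev->active_mm = NULL;
        return 0;
    }

    /* to user task: a real switch to next's own page tables (simplified) */
    next->active_mm = next->mm;
    if (!prev->mm) {                        /* from kernel: drop the borrowed pin */
        toy_mmdrop(prev->active_mm);
        prev->active_mm = NULL;
    }
    return 1;
}
```

    Running user A -> kthread K1 -> kthread K2 -> user B through this model, A's mm gets pinned once when K1 borrows it, the pin moves from K1 to K2 without the count changing, and it is released when B's page tables are finally loaded.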

    The reason kernel threads can get away with using any previous task's page tables is that every user process's page tables have the same kernel-space mappings in their PGDs.* This saves some memory (the lower-level kernel page tables are shared rather than duplicated per process), but it is mostly a performance optimization: running a kernel thread does not require loading a different page table.
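
    As a rough illustration of that property, the sketch below builds two toy PGDs whose kernel halves are copied from one reference table. Everything here is an assumption made for the example: the toy_ names are hypothetical, the 512-entry table and the split at entry 256 follow the usual x86-64 4-level layout, and the entries are plain integers rather than real page-table entries.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PTRS_PER_PGD 512  /* entries in a top-level table (x86-64, 4-level) */
#define TOY_KERNEL_START 256  /* first entry of the kernel half (assumed split) */

typedef uint64_t toy_pgd_t;

/* Reference kernel table that every process PGD copies its kernel half from. */
static toy_pgd_t toy_kernel_pgd[TOY_PTRS_PER_PGD];

/* Build a new process PGD: empty user half, kernel half copied from the reference. */
void toy_pgd_init(toy_pgd_t *pgd)
{
    memset(pgd, 0, TOY_KERNEL_START * sizeof(*pgd));
    memcpy(pgd + TOY_KERNEL_START, toy_kernel_pgd + TOY_KERNEL_START,
           (TOY_PTRS_PER_PGD - TOY_KERNEL_START) * sizeof(*pgd));
}

/* Top-level index of a 48-bit, 4-level virtual address: bits 47..39. */
unsigned toy_pgd_index(uint64_t vaddr)
{
    return (unsigned)((vaddr >> 39) & (TOY_PTRS_PER_PGD - 1));
}

/* Return the top-level entry that covers vaddr in the given PGD. */
toy_pgd_t toy_pgd_lookup(const toy_pgd_t *pgd, uint64_t vaddr)
{
    return pgd[toy_pgd_index(vaddr)];
}
```

    With two PGDs initialized this way, a lookup of a kernel address such as 0xffff888000000000 returns the same entry through either table, while a user address such as 0x00007f0000000000 can resolve differently in each. That is why a kernel thread does not care whose PGD happens to be loaded.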

    The enter_lazy_tlb function comment gives more insight:

    /*
     * Please ignore the name of this function.  It should be called
     * switch_to_kernel_thread().
     *
     * enter_lazy_tlb() is a hint from the scheduler that we are entering a
     * kernel thread or other context without an mm.  Acceptable implementations
     * include doing nothing whatsoever, switching to init_mm, or various clever
     * lazy tricks to try to minimize TLB flushes.
     *
     ...
     */
    

    The goal of "lazy TLB" is to minimize TLB flushes when switching to kernel threads, by taking advantage of the fact that they can run on any active mm_struct's page tables.
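
    To get a feel for the saving, here is a small sketch that counts page-table loads for a given schedule under two policies: the lazy one, and an "eager" one where every kernel thread switches to a dedicated kernel mm (the "switching to init_mm" option from the comment). The function and constants are hypothetical, and a page-table load is only a proxy for a TLB flush, since real hardware may avoid the flush in some cases.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_KTHREAD  0   /* schedule entry for a task with no mm */
#define TOY_INIT_MM -1   /* id of a dedicated kernel mm, used by the eager policy */

/*
 * Count page-table loads for a schedule given as an array of mm ids.
 * lazy != 0: kernel threads keep whatever mm is currently loaded.
 * lazy == 0: kernel threads always switch to TOY_INIT_MM.
 */
int toy_count_switches(const int *sched, size_t n, int lazy)
{
    int loaded = TOY_INIT_MM;   /* mm whose page tables are currently loaded */
    int switches = 0;

    for (size_t i = 0; i < n; i++) {
        int want = sched[i];

        if (want == TOY_KTHREAD)
            want = lazy ? loaded : TOY_INIT_MM;

        if (want != loaded) {   /* only a different mm costs a load */
            loaded = want;
            switches++;
        }
    }
    return switches;
}
```

    For the schedule {1, 0, 1, 0, 1, 2, 0, 2} (process 1 and process 2 interleaved with kernel threads), the lazy policy needs 2 loads, while the eager policy needs 8, because every hop through a kernel thread forces a switch away from and back to the user mm.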

    2. Does the kernel maintain a page table exclusively for kernel functioning?

    If you're asking whether the kernel has its own page table that must be switched to from other contexts, the way page tables are switched between user processes, then no. The code above shows that a context switch to a kernel thread (or between kernel threads) can just keep using the currently active page tables, since every process's page tables contain the kernel mappings.

    If you're asking whether kernel code needs to be accessed through page tables (i.e. virtual address translation), then yes. When paging is on, all addresses go through translation, even if the translation is linear or direct.
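
    For the "linear or direct" case, the mapping that the page tables encode is just a constant offset. The sketch below shows the arithmetic, assuming a direct-map base of 0xffff888000000000 (the documented x86-64 default for 4-level paging without KASLR; the real value depends on architecture and configuration). toy_va() and toy_pa() are hypothetical helpers, not the kernel's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base of the direct map; architecture- and config-specific in reality. */
#define TOY_PAGE_OFFSET 0xffff888000000000ULL

/* Physical address -> kernel virtual address inside the direct map. */
uint64_t toy_va(uint64_t phys)
{
    return phys + TOY_PAGE_OFFSET;
}

/* Kernel virtual address inside the direct map -> physical address. */
uint64_t toy_pa(uint64_t virt)
{
    assert(virt >= TOY_PAGE_OFFSET);  /* only valid for direct-map addresses */
    return virt - TOY_PAGE_OFFSET;
}
```

    Under that assumption, physical address 0x1000 is visible at virtual address 0xffff888000001000. The helpers only describe the relationship; at run time the MMU still resolves the virtual address through the page tables (or the TLB), it does not do this subtraction itself.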

    *Note: Kernel page table isolation (KPTI) complicates this statement, because with this feature each process has two PGDs: one used while running in the kernel, with the full kernel mappings, and one used while running in user space, with only the minimal kernel mappings needed to enter and leave the kernel.