Tags: linux, assembly, arm64, page-fault, sve

How do I force a page to generate a page fault on next access?


I am trying to develop a routine using SVE. SVE provides fault-avoiding loads which do not load from memory that would lead to a fault if accessed. As the CPU does not know the reason why a page is unmapped or inaccessible, it cannot distinguish between memory that would trigger an invalid page fault and memory that would trigger a major/minor page fault (which is usually transparent to the application).

SVE code using these instructions must thus be prepared for stray faults to be indicated and must retry the loads using ordinary (faulting) load instructions if it requires the data after all. For example, consider a routine operating on NUL-terminated strings. A first-fault load is used to load a chunk of the string. If a fault was avoided only after a NUL character, everything is fine. But if that happened before a NUL character, we must retry the load with conventional load instructions, as the string has been proven to cross into the faulting page.

If such “retry on avoided fault” paths are present in the code, they must be tested. However, it is not obvious to me how to prepare a page to fault (with a major or minor page fault) on next access. If an all-zero page is acceptable, one possibility seems to be to just map a fresh anonymous page and rely on the kernel's lazy page allocation. However, this is not guaranteed or documented to cause the desired effect.

For arbitrary pages, the madvise system call has the MADV_PAGEOUT option, which seems like it might give the desired effect, but the man page does not document whether the effect takes place immediately, and states that it might not affect certain pages. It is also unclear whether the call works in the absence of swap space. Success or failure does not seem to be reported unambiguously, so it's unclear whether a unit test can rely on this call. It would be quite bad for the unit test to silently pass because the page was not actually unmapped when it ran.

What is the recommended course of action?

Also interested in responses for other operating systems (such as FreeBSD) and in particular also hardware-specific approaches, that may or may not be specific to ARM.


Solution

  • The approach I ended up using was to allocate a page without read/write permissions. On first access, a SIGSEGV would occur, ensuring that the access failed. In the signal handler, I then permit access to the page and return, resuming the code under test.

    While the performance of this code is likely worse than that of the approach suggested by Michael Karcher, the code clearly and obviously reaches the desired result, which is more important to me.

    Here is some example code, testing a custom strtol implementation:

    #include <signal.h>
    #include <sys/mman.h>
    #include <sys/param.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    #ifndef PAGE_SIZE
    #define PAGE_SIZE 16384
    #endif
    
    long mystrtol(const char *restrict, char **restrict, int);
    
    /* a signal handler that makes testpage readable and writable and then returns */
    static void *testpage;
    static void
    maptestpage(int sig)
    {
        mprotect(testpage, PAGE_SIZE, PROT_READ|PROT_WRITE);
    }
    
    /*
     * Call mystrtol() on the given input with a page fault after the given
     * number of characters.  Print an error if the return value is not
     * equal to what strtol() says it should be.
     */
    static void
    test_mystrtol(const char *str, size_t off)
    {
        struct sigaction sa;
        long num;
        int res;
        char *data, *cpy, *endptr;
    
        data = mmap(NULL, 2*PAGE_SIZE, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE, -1, 0);
        if (data == MAP_FAILED) {
            perror("mmap");
            return;
        }
    
        cpy = data + PAGE_SIZE - off;
        strcpy(cpy, str);
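        /* bail out if the page cannot be protected; otherwise the test would pass without a fault */
        if (mprotect(data + PAGE_SIZE, PAGE_SIZE, PROT_NONE) != 0) {
            perror("mprotect");
            goto end;
        }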
        testpage = data + PAGE_SIZE;
    
        sa.sa_handler = maptestpage;
        sa.sa_flags = SA_RESETHAND;
        sigfillset(&sa.sa_mask);
        res = sigaction(SIGSEGV, &sa, NULL);
        if (res != 0) {
            perror("sigaction");
            goto end;
        }
    
        num = mystrtol(cpy, &endptr, 10);
        signal(SIGSEGV, SIG_DFL);
    
        if (num != ...) {
            /* ... */
        }
    
    end:    munmap(data, 2*PAGE_SIZE);
    }