c linux mmap virtual-memory

mmap memory backed by other memory?


I'm not sure if this question makes sense, but let's say I have a pointer to some memory:

char *mem;
size_t len;

Is it possible to somehow map the contents of mem to another address as a read-only mapping? i.e. I want to obtain a pointer mem2 such that mem2 != mem and accessing mem2[i] actually reads mem[i] (without doing a copy).

My ultimate goal would be to take non-contiguous chunks of memory and make them appear to be contiguous by mapping them next to each other.

One approach I considered is to use fmemopen and then mmap, but there's no file descriptor associated with the result of fmemopen.


Solution

  • General case - no control over first mapping

    /proc/[PID]/pagemap + /dev/mem

    The only way I can think of making this work without any copying is by manually opening and checking /proc/[PID]/pagemap to get the Page Frame Number of the physical page corresponding to the page you want to "alias", and then opening and mapping /dev/mem at the corresponding offset. While this would work in theory, it would require root privileges, and is most likely not possible on any reasonable Linux distribution since the kernel is usually configured with CONFIG_STRICT_DEVMEM=y, which restricts the usage of /dev/mem. For example, on x86 it disallows reading system RAM from /dev/mem and only allows reading memory-mapped PCI regions. Furthermore, note that in order for this to work the page you want to "alias" needs to be locked to keep it in RAM.

    In any case, here's an example of how this would work if you were able and willing to do it (I am assuming x86-64 with 4 KiB pages here):

    #include <stdio.h>
    #include <errno.h>
    #include <limits.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <fcntl.h>
    
    /* Get the physical address of an existing virtual memory page and map it. */
    
    int main(void) {
        FILE *fp;
        unsigned long addr, info, physaddr, val;
        long off;
        int fd;
        void *mem;
        void *orig_mem;
    
        // Suppose that this is the existing page you want to "alias"
        orig_mem = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
        if (orig_mem == MAP_FAILED) {
            perror("mmap orig_mem failed");
            return 1;
        }
    
        // Write a dummy value just for testing
        *(unsigned long *)orig_mem = 0x1122334455667788UL;
    
        // Lock the page to prevent it from being swapped out
        if (mlock(orig_mem, 0x1000)) {
            perror("mlock orig_mem failed");
            return 1;
        }
    
        fp = fopen("/proc/self/pagemap", "rb");
        if (!fp) {
            perror("Failed to open \"/proc/self/pagemap\"");
            return 1;
        }
    
        addr = (unsigned long)orig_mem;
        off  = addr / 0x1000 * 8;
    
        if (fseek(fp, off, SEEK_SET)) {
            perror("fseek failed");
            return 1;
        }
    
        // Get its information from /proc/self/pagemap
        if (fread(&info, sizeof(info), 1, fp) != 1) {
            perror("fread failed");
            return 1;
        }
    
        physaddr = (info & ((1UL << 55) - 1)) << 12;
    
        printf("Value: %016lx\n", info);
        printf("Physical address: 0x%016lx\n", physaddr);
    
        // Ensure page is in RAM, should be true since it was mlock'd
        if (!(info & (1UL << 63))) {
            fputs("Page is not in RAM? Strange! Aborting.\n", stderr);
            return 1;
        }
    
        fd = open("/dev/mem", O_RDONLY);
        if (fd == -1) {
            perror("open(\"/dev/mem\") failed");
            return 1;
        }
    
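        // Note: MAP_ANONYMOUS must not be used here, otherwise mmap ignores
        // fd and returns a fresh zero-filled page instead of the /dev/mem page
        mem = mmap(NULL, 0x1000, PROT_READ, MAP_PRIVATE, fd, physaddr);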
        if (mem == MAP_FAILED) {
            perror("Failed to mmap \"/dev/mem\"");
            return 1;
        }
    
        // Now `mem` is effectively referring to the same physical page that
        // `orig_mem` refers to.
    
        // Try reading 8 bytes (note: this will just return 0 if
        // CONFIG_STRICT_DEVMEM=y).
        val = *(unsigned long *)mem;
    
        printf("Read 8 bytes at physaddr 0x%016lx: %016lx\n", physaddr, val);
    
        return 0;
    }
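
The bit manipulation in the program above can be pulled out into small helpers. This is only a sketch: the names `decode_pagemap_entry`, `pagemap_offset` and `phys_addr` are made up for illustration, and the layout assumed is the one the program already relies on (bits 0-54 hold the PFN, bit 62 means swapped, bit 63 means present). Note that on recent kernels the PFN field reads as zero unless the process has CAP_SYS_ADMIN, so a zero PFN on a present page most likely means missing privileges.

```c
#include <stdint.h>

#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1) /* bits 0-54: page frame number */
#define PM_SWAPPED  (UINT64_C(1) << 62)       /* bit 62: page is in swap */
#define PM_PRESENT  (UINT64_C(1) << 63)       /* bit 63: page is in RAM */

struct pagemap_entry {
    int present;   /* nonzero if the page is resident in RAM */
    int swapped;   /* nonzero if the page is swapped out */
    uint64_t pfn;  /* only meaningful when present is nonzero */
};

/* Byte offset of the 8-byte entry for vaddr inside /proc/[PID]/pagemap. */
static inline uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    return vaddr / page_size * 8;
}

/* Split a raw 64-bit pagemap entry into the fields used above. */
static inline struct pagemap_entry decode_pagemap_entry(uint64_t raw)
{
    struct pagemap_entry e;

    e.present = (raw & PM_PRESENT) != 0;
    e.swapped = (raw & PM_SWAPPED) != 0;
    e.pfn = e.present ? (raw & PM_PFN_MASK) : 0;
    return e;
}

/* Physical address of the byte at vaddr, given the PFN of its page. */
static inline uint64_t phys_addr(uint64_t pfn, uint64_t vaddr, uint64_t page_size)
{
    return pfn * page_size + vaddr % page_size;
}
```

The offset passed to mmap on /dev/mem must be page-aligned, so for mapping you would use `pfn * page_size` and add the in-page offset to the returned pointer.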
    

    userfaultfd(2)

    Other than what I described above, AFAIK there isn't a way to do what you want from userspace without copying, i.e. there is no way to simply tell the kernel "map this second virtual address to the same memory as an existing one". You can however register a userspace handler for page faults through the userfaultfd(2) syscall and ioctl_userfaultfd(2), and I think this is overall your best shot.

    The whole mechanism is similar to what the kernel would do with a real memory page, except that the faults are handled by a user-defined handler thread in userspace. This is still an actual copy, but it is atomic from the point of view of the faulting thread and gives you more control. It could also perform better in general, since you control the copying and can do it only if and when needed (i.e. at the first read fault), whereas with a normal mmap + copy you always copy, regardless of whether the page is ever accessed later.

    There is a pretty good example program in the manual page for userfaultfd(2) which I linked above, so I'm not going to copy-paste it here. It deals with one or more pages and should give you an idea about the whole API.
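
For a rough idea of the shape, here is a much smaller sketch than the one in the manual page. It handles exactly one fault on one page and is not a general solution. The names `lazy_ctx`, `handle_one_fault` and `lazy_fill_demo` are invented for this example, cleanup on error paths is omitted, and passing `UFFD_USER_MODE_ONLY` assumes a kernel that accepts it (Linux 5.11 or later). On systems where userfaultfd is not supported or not permitted, the function just reports that.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

struct lazy_ctx {
    int uffd;     /* userfaultfd descriptor */
    char *src;    /* existing memory whose contents should appear in dst */
    char *dst;    /* registered region, populated on first access */
    size_t page;  /* system page size */
};

/* Handler thread: resolve one missing-page fault by copying the page from src. */
static void *handle_one_fault(void *arg)
{
    struct lazy_ctx *c = arg;
    struct uffd_msg msg;

    /* Blocks until some thread faults inside the registered range. */
    if (read(c->uffd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg) ||
        msg.event != UFFD_EVENT_PAGEFAULT)
        _exit(2); /* sketch only: bail out rather than leave the reader blocked */

    uintptr_t fault = (uintptr_t)msg.arg.pagefault.address & ~((uintptr_t)c->page - 1);
    struct uffdio_copy copy = {
        .dst = fault,
        .src = (uintptr_t)c->src + (fault - (uintptr_t)c->dst),
        .len = c->page,
    };
    /* Installs the page atomically and wakes up the faulting thread. */
    if (ioctl(c->uffd, UFFDIO_COPY, &copy) == -1)
        _exit(2);
    return NULL;
}

/* Returns 0 if dst was filled from src on first access, 1 if userfaultfd is
 * unavailable (unsupported or not permitted), -1 on any other failure. */
int lazy_fill_demo(void)
{
    struct lazy_ctx c = { .page = (size_t)sysconf(_SC_PAGESIZE) };
    pthread_t thr;
    int flags = O_CLOEXEC;

#ifdef UFFD_USER_MODE_ONLY
    flags |= UFFD_USER_MODE_ONLY; /* assumption: kernel >= 5.11 */
#endif
    c.uffd = (int)syscall(SYS_userfaultfd, flags);
    if (c.uffd == -1)
        return 1;

    struct uffdio_api api = { .api = UFFD_API };
    if (ioctl(c.uffd, UFFDIO_API, &api) == -1)
        return 1;

    c.src = mmap(NULL, c.page, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    c.dst = mmap(NULL, c.page, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (c.src == MAP_FAILED || c.dst == MAP_FAILED)
        return -1;
    strcpy(c.src, "hello from the original page");

    /* Ask to be notified about accesses to not-yet-populated pages of dst. */
    struct uffdio_register reg = {
        .range = { .start = (uintptr_t)c.dst, .len = c.page },
        .mode = UFFDIO_REGISTER_MODE_MISSING,
    };
    if (ioctl(c.uffd, UFFDIO_REGISTER, &reg) == -1)
        return 1;
    if (pthread_create(&thr, NULL, handle_one_fault, &c) != 0)
        return -1;

    /* The first access to dst faults; the handler thread supplies the data. */
    int same = memcmp(c.dst, c.src, c.page) == 0;

    pthread_join(thr, NULL);
    munmap(c.src, c.page);
    munmap(c.dst, c.page);
    close(c.uffd);
    return same ? 0 : -1;
}
```

Depending on your glibc version you may need to compile with `-pthread`. After the fault is resolved, `dst` holds a snapshot: later writes to `src` are not reflected, which is the "still a copy" caveat mentioned above.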

  • Simpler case - control over the first mapping

    In the case you do have control over the first mapping which you want to "alias", then you can simply create a shared mapping. What you are looking for is memfd_create(2). You can use it to create an anonymous file which can then be mmaped multiple times with different permissions.

    Here's a simple example:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    
    int main(void) {
            int memfd;
            void *mem_ro, *mem_rw;
    
            // Create a memfd
            memfd = memfd_create("something", 0);
            if (memfd == -1) {
                    perror("memfd_create failed");
                    return 1;
            }
    
            // Give the file a size, otherwise reading/writing will fail
            if (ftruncate(memfd, 0x1000) == -1) {
                    perror("ftruncate failed");
                    return 1;
            }
    
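            // Map the fd as read only. MAP_SHARED is used here too: with
            // MAP_PRIVATE it is unspecified whether later writes made through
            // other mappings are visible in this one
            mem_ro = mmap(NULL, 0x1000, PROT_READ, MAP_SHARED, memfd, 0);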
            if (mem_ro == MAP_FAILED) {
                    perror("mmap failed");
                    return 1;
            }
    
            // Map the fd as read/write and shared (shared is needed if we want
            // write operations to be propagated to the other mappings)
            mem_rw = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, memfd, 0);
            if (mem_rw == MAP_FAILED) {
                    perror("mmap failed");
                    return 1;
            }
    
            printf("ro mapping @ %p\n", mem_ro);
            printf("rw mapping @ %p\n", mem_rw);
    
            // This write can now be read from both mem_ro and mem_rw
            *(char *)mem_rw = 123;
    
            // Test reading
            printf("read from ro mapping: %d\n", *(char *)mem_ro);
            printf("read from rw mapping: %d\n", *(char *)mem_rw);
    
            return 0;
    }
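
Going back to the original goal of making non-contiguous chunks appear contiguous, the same memfd idea can be extended with MAP_FIXED. The following is a sketch under a few assumptions: the chunks live in the memfd, each chunk starts on a page boundary, and its length is a multiple of the page size. The function name `contiguous_view_demo` and the memfd name "chunks" are made up for the example.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if two non-adjacent pages of a memfd show up back to back. */
int contiguous_view_demo(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int ok = 0;

    int fd = memfd_create("chunks", 0);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)(4 * page)) == -1) {
        close(fd);
        return -1;
    }

    /* The "first mapping": ordinary read/write access to all four pages. */
    char *rw = mmap(NULL, 4 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (rw == MAP_FAILED) {
        close(fd);
        return -1;
    }
    strcpy(rw, "chunk A");            /* lives in page 0 */
    strcpy(rw + 3 * page, "chunk B"); /* lives in page 3 */

    /* Reserve two contiguous pages of address space to overlay. */
    char *view = mmap(NULL, 2 * page, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (view == MAP_FAILED) {
        munmap(rw, 4 * page);
        close(fd);
        return -1;
    }

    /* Map pages 0 and 3 of the file, read-only, next to each other. */
    void *a = mmap(view, page, PROT_READ, MAP_SHARED | MAP_FIXED, fd, 0);
    void *b = mmap(view + page, page, PROT_READ, MAP_SHARED | MAP_FIXED,
                   fd, (off_t)(3 * page));
    if (a != MAP_FAILED && b != MAP_FAILED) {
        /* A later write through rw is visible through the view as well. */
        rw[3 * page] = 'C';
        ok = strcmp(view, "chunk A") == 0 &&
             strcmp(view + page, "Chunk B") == 0;
    }

    munmap(view, 2 * page);
    munmap(rw, 4 * page);
    close(fd);
    return ok ? 0 : -1;
}
```

Reserving the range with PROT_NONE first means the MAP_FIXED calls only ever replace address space that already belongs to you, instead of clobbering some unrelated mapping.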