operating-systemcompiler-optimizationriscvxv6virtual-address-space

Why is 38 written as 9 + 9 + 9 +12 - 1 in xv6 RISC-V source code


I was trying to add a few system calls to the xv6 source code developed at MIT, and upon reading this resource (https://pdos.csail.mit.edu/6.S081/2020/xv6/book-riscv-rev1.pdf), on page 26, they consider the maximum possible virtual address that XV6RISC-V supports.

When I tried to look at the source code for the definition of MAXVA (MAXimum Virtual Address), I came across the following code snippet.

// one beyond the highest possible virtual address.
// MAXVA is actually one bit less than the max allowed by
// Sv39, to avoid having to sign-extend virtual addresses
// that have the high bit set.
#define MAXVA (1L << (9 + 9 + 9 + 12 - 1))

I was intrigued by the expression '9 + 9 + 9 + 12 - 1' instead of simply writing '38'. I tried to look up the underlying reasoning for this but did not find anything related. Is this some kind of optimization? If so, at what level is this relevant and where else could this be relevant?

I have some experience in assembly language programming and understand the basics of how the C code written is being translated to the assembly structure and how the final assembly code corresponding to bit shifting might look like, (using x86 salq and sarq or RISC-V slli). Any hints/ thoughts would be appreciated as well.


Solution

  • The answer could be guessed from vm.c

    // The risc-v Sv39 scheme has three levels of page-table
    // pages. A page-table page contains 512 64-bit PTEs.
    // A 64-bit virtual address is split into five fields:
    //   39..63 -- must be zero.
    //   30..38 -- 9 bits of level-2 index.
    //   21..29 -- 9 bits of level-1 index.
    //   12..20 -- 9 bits of level-0 index.
    //    0..11 -- 12 bits of byte offset within the page.
    

    For the compiler, write 9+9+9+12-1 is the same than 38, but for the human -aware of bits fields- reading this, the first is more clear.