I'm currently making an AArch32 emulator, and I'm trying to make an MMU.
While I've got most of the descriptor logic, access permissions, and address translation mechanisms set up, I still have doubts about certain edge cases that came to mind.
For example, imagine I have a virtual address in a "tiny page", which maps 1 KB of virtual address space. What if I want to access an 8-byte memory region at page index 1022? That would cross the page boundary to index 1030, which is beyond the 1 KB page. What happens then?
The ARM documentation doesn't specify what happens (afaik), and I'm not exactly sure what to do here. So I came up with 3 hypotheses:
Am I missing something? Or am I completely getting the point wrong? Please let me know. Thanks.
reference manual I'm using: https://cdrdv2-public.intel.com/654202/ddi0100e_arm_arm.pdf
The manual you linked is for ARMv5, which is very old (the manual is dated 2000). For it, the behavior is actually different from any of your hypotheses.
You mentioned that ARM has optional alignment traps; when they are enabled, an unaligned access simply aborts.
When alignment traps are not enabled, the behavior of unaligned loads and stores is a little unusual. The machine calculates an address value according to the desired addressing mode, and masks off the low two bits to generate a virtual address. It translates that address and loads or stores a single aligned 32-bit word from memory. So, page crossing never actually occurs.
For a store, the low two bits of the address value are simply ignored. For a load, the low two bits are a count of bytes by which the loaded value is rotated before writing it to the destination register.
You can see this stated in the definitions of the STR and LDR instructions in the document you linked, as well as in Section 2.7.3, "Unaligned Accesses".
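As a rough illustration of the load and store behavior described above, here is a minimal Python sketch. The names `ror32`, `ldr_armv5`, `str_armv5`, and the `read_word`/`write_word` callbacks are hypothetical, not taken from the manual; the callbacks are assumed to translate the aligned virtual address and access one 32-bit word (little-endian).

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount &= 31
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def ldr_armv5(read_word, address):
    """LDR with alignment traps disabled: one aligned word read, then a rotate."""
    aligned = address & 0xFFFFFFFC           # low two bits masked off
    word = read_word(aligned)                # single translation, single word
    return ror32(word, 8 * (address & 0x3))  # low two bits = rotate count in bytes

def str_armv5(write_word, address, value):
    """STR with alignment traps disabled: low two address bits are ignored."""
    write_word(address & 0xFFFFFFFC, value & MASK32)

# Usage: memory modeled as a dict of aligned word addresses (an assumption).
mem = {0x1000: 0x44332211}
print(hex(ldr_armv5(mem.__getitem__, 0x1001)))  # 0x11443322
```

Because only the aligned word is ever touched, an emulator following this model never needs a second translation for a single LDR or STR.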
This behavior does not seem very useful, and is probably unlikely to be used deliberately except in very special cases.
It might be exposing what were originally implementation details. For instance, a natural way to implement LDRB on a simple ARM CPU might have been to mask the low bits of the address, load a word, rotate it using the low bits so that the desired byte ends up as the least-significant byte of the word, and finally zero the high 24 bits. Then LDR of an unaligned address would match this behavior, but leaving the high 24 bits alone instead of zeroing. (There are some differences in big-endian mode, but I believe that was a later addition to the architecture.) It could be that this behavior wasn't really intended to be used, but some obscure programs may have made use of it, so that it eventually had to be codified in the architecture spec.
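To make that speculation concrete, here is a small Python sketch in which LDRB and LDR share one datapath. This only illustrates the guess above, not how any real core is documented to work; `load_rotated`, `ldrb`, `ldr`, and the `read_word` callback are hypothetical names, and little-endian memory is assumed.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount &= 31
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def load_rotated(read_word, address):
    """Shared path: fetch the aligned word, rotate the addressed byte to bits 7:0."""
    word = read_word(address & 0xFFFFFFFC)
    return ror32(word, 8 * (address & 0x3))

def ldrb(read_word, address):
    """Byte load: same path, then zero the high 24 bits."""
    return load_rotated(read_word, address) & 0xFF

def ldr(read_word, address):
    """Word load: same path, high 24 bits left alone."""
    return load_rotated(read_word, address)

# Usage: memory modeled as a dict of aligned word addresses (an assumption).
mem = {0x2000: 0x44332211}
print(hex(ldrb(mem.__getitem__, 0x2002)))  # 0x33
print(hex(ldr(mem.__getitem__, 0x2002)))   # 0x22114433
```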
Your example mentions accessing an eight-byte memory region, but I don't think ARMv5 has any such instructions, per se. LDRD/STRD were not present in that version of the architecture. LDM/STM simply perform the microcoded equivalent of a sequence of LDR/STR, except that they both ignore the low bits of an unaligned address; LDM does not rotate as LDR does.
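A simplified sketch of that LDM behavior in Python, assuming increment-after addressing only and ignoring writeback and register lists; `ldm_armv5` and `read_word` are hypothetical names. Each word is aligned, so each individual read stays inside one page even if the whole transfer spans several.

```python
MASK32 = 0xFFFFFFFF

def ldm_armv5(read_word, base, count):
    """Load `count` consecutive words; low two bits of base ignored, no rotation."""
    start = base & 0xFFFFFFFC
    return [read_word((start + 4 * i) & MASK32) for i in range(count)]

# Usage: memory modeled as a dict of aligned word addresses (an assumption).
mem = {0x3000: 0xAAAAAAAA, 0x3004: 0xBBBBBBBB}
print([hex(w) for w in ldm_armv5(mem.__getitem__, 0x3002, 2)])
```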
For ARMv6 and later, the behavior changed. For them, unaligned accesses (when enabled) behave more intuitively: all bits of the address value are used, and you get 4 bytes starting at that address, spanning two 32-bit words. If this crosses a page boundary, then those bytes are read from the two different pages, and two page translations will occur. This is essentially your hypothesis #2. If both translations require a page table walk or cause a Data Abort, I don't think it's specified which one happens first.
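One way an emulator might model the ARMv6-style case is to translate per byte (or per page touched) and assemble the result. The Python sketch below is only an illustration under stated assumptions: 4 KB pages, a flat dict as a stand-in page table, byte-addressed physical memory, little-endian data, and translation in ascending address order. That ordering is my own choice for the sketch, since the order of the two translations is, as far as I know, not specified. All names (`translate`, `load_unaligned`, `DataAbort`) are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages

class DataAbort(Exception):
    """Raised when a virtual address has no mapping (stand-in for a Data Abort)."""

def translate(page_table, vaddr):
    """Hypothetical lookup: virtual page number -> physical page base."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise DataAbort(hex(vaddr))
    return page_table[vpn] + offset

def load_unaligned(page_table, phys_mem, vaddr, size=4):
    """Read `size` bytes starting at vaddr; returns (value, virtual pages touched)."""
    value = 0
    pages = []
    for i in range(size):
        va = (vaddr + i) & 0xFFFFFFFF
        vpn = va // PAGE_SIZE
        if vpn not in pages:
            pages.append(vpn)              # a new page means a new translation
        pa = translate(page_table, va)
        value |= phys_mem.get(pa, 0) << (8 * i)
    return value, pages

# Usage: a 4-byte load starting 2 bytes before the end of virtual page 0.
page_table = {0: 0x8000, 1: 0x3000}
phys_mem = {0x8FFE: 0x11, 0x8FFF: 0x22, 0x3000: 0x33, 0x3001: 0x44}
value, pages = load_unaligned(page_table, phys_mem, 0xFFE)
print(hex(value), pages)  # 0x44332211 [0, 1]
```

Here the two halves of the word come from unrelated physical pages, and an unmapped second page aborts the whole access.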
It's not clear what "vague holes" you see in this, but they should be explained in the manuals. (ARM's more recent manuals, e.g. ARMv8-A, are much more careful about specifying behavior in precise formal detail, though that makes them harder for a working programmer to read.)