I've gotten virtual memory working on ARMv8 after crafting the page tables. Oddly, most of my translations work (identity mapped) except for Flash, which sits at physical address zero. I use a single function to edit the page tables, so it is strange to me that some mappings work and others do not. Specifically, I have only a few ranges mapped:
Flash [0x00000000, len = 0x08000000]
UART [0x09000000, len = 0x00001000]
RAM [0x40000000, len = 0x0fe00000]
Secure RAM [0x4fe00000, len = 0x00200000]
And again, they all work except for Flash. My mapping function also works for non-identity maps. There is just something strange about that Flash range.
I have an exception handler in place that I'm using to dissect the issue. When catching Data Abort exceptions, I see two different sub-types depending on the memory being accessed:
- [1] Flash address range (e.g. 0x00000000)
  - ESR.ISS = 0x10 (ISS.DFSC = 0x10)
  - Synchronous External abort, not on translation table walk
- [2] An expected unmapped address (e.g. 0x50000000)
  - ESR.ISS = 0x06 (ISS.DFSC = 0x06)
  - Synchronous External abort, on translation table walk, level 2
When I access an address that I do not expect to be in the table, I get [2] (the fault is at level 2 because some nearby addresses were mapped).
When I access Flash, which I do expect to be in the table, I get [1] (not on table walk).
So, I am confused about what these two cases represent. What is the difference between [1] and [2]? They seem to represent the same thing. Does [1] somehow represent the case where translation failed before it was even attempted? There are defined level 0 faults that I would expect to see if that were the case. I was expecting the "not on table walk" fault for the address I did not expect to be in the table, but instead received the other.
As it turned out, the issue I was seeing was due to Flash being mapped as a Secure-only device, so only secure accesses will make it through the MMU (i.e. NSTable=0 on table entries and NS=0 on the block entries).
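To make the NS/NSTable remark concrete, here is a minimal sketch in C of how those bits sit in stage 1 descriptors. It assumes the 4KB granule with 2MB level 2 blocks, which my original post does not state, and the macro and function names (`DESC_NS`, `make_l2_block`, `make_table_desc`) are made up for illustration rather than taken from my real mapping function. Note that these bits are only honoured for translations performed in Secure state.

```c
#include <stdint.h>

/* Stage 1 descriptor bits (VMSAv8-64, 4KB granule assumed). */
#define DESC_VALID      (UINT64_C(1) << 0)
#define DESC_TABLE      (UINT64_C(1) << 1)   /* set: table descriptor; clear: block */
#define DESC_ATTRIDX(i) ((uint64_t)(i) << 2) /* index into MAIR_ELx */
#define DESC_NS         (UINT64_C(1) << 5)   /* block/page: output PA is Non-secure */
#define DESC_AF         (UINT64_C(1) << 10)  /* access flag */
#define DESC_NSTABLE    (UINT64_C(1) << 63)  /* table: next-level lookups are Non-secure */

#define L2_BLOCK_ADDR_MASK UINT64_C(0x0000FFFFFFE00000) /* PA bits [47:21] */
#define TABLE_ADDR_MASK    UINT64_C(0x0000FFFFFFFFF000) /* PA bits [47:12] */

/* Hypothetical helper: build a level 2 block descriptor for a 2MB region. */
static uint64_t make_l2_block(uint64_t pa, unsigned attr_index, int non_secure)
{
    uint64_t desc = (pa & L2_BLOCK_ADDR_MASK) | DESC_VALID | DESC_AF
                  | DESC_ATTRIDX(attr_index & 7);
    if (non_secure)
        desc |= DESC_NS;      /* leave clear (NS=0) for Secure-only devices */
    return desc;
}

/* Hypothetical helper: build a table descriptor pointing at the next level. */
static uint64_t make_table_desc(uint64_t next_table_pa, int non_secure)
{
    uint64_t desc = (next_table_pa & TABLE_ADDR_MASK) | DESC_VALID | DESC_TABLE;
    if (non_secure)
        desc |= DESC_NSTABLE; /* leave clear (NSTable=0) on the path to Flash */
    return desc;
}
```

With these helpers, the Flash mapping would be built with `non_secure = 0` at every level, e.g. `make_l2_block(0x00000000, attr, 0)`, while a Non-secure RAM block would pass `1`.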
After realizing this, and with help from @artlessnoise regarding "External Aborts" in the comments, I've come to the following distinction between the two subtypes of Data Abort:
"Not on translation table walk"
This does not mean, as I first read it, that the exception occurred because the requested memory address was not found "on the translation table walk". Rather, it means that the exception occurred while not on a translation table walk (i.e. before or after the walk itself). In this case the Flash PA was mapped in the tables but my accesses were Non-secure, so the device was not responding (the accesses were not being routed through to the device). This lack of response caused the external abort. As such, ARM might have defined this exception more clearly as, for example:
"Not on translation table walk" -> "Not as a result of translation table walk"
The other subtype:
"On translation table walk, level 2"
This is what one should expect to receive during a page fault. In other words, an attempt was made to access memory that does not have a VA->PA mapping in the translation tables. The level observed reflects how far the MMU walked the tables before stopping.
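As a sanity check on the reported level, the table indices for a faulting address can be worked out by hand. The sketch below assumes the 4KB granule with a 48-bit VA (9 index bits per level), which is an assumption on my part and not something stated above; `table_index` is a made-up helper name.

```c
#include <stdint.h>

/*
 * Index into the translation table at a given level (0..3),
 * assuming the 4KB granule: level 0 uses VA[47:39], level 1 VA[38:30],
 * level 2 VA[29:21], level 3 VA[20:12].
 */
static unsigned table_index(uint64_t va, unsigned level)
{
    unsigned shift = 39u - 9u * level;
    return (unsigned)((va >> shift) & 0x1ff);
}
```

Under that assumption, 0x50000000 and the mapped RAM at 0x40000000 share level 1 entry 1, so the walk gets as far as the level 2 table. The last mapped 2MB block (Secure RAM at 0x4fe00000) is level 2 entry 127, and 0x50000000 is entry 128, which is empty. That is consistent with the walk stopping at level 2.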
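Note that I removed the "Synchronous External abort" part of the definitions. Strictly speaking, that prefix only belongs on the first one: DFSC 0x10 is an External abort, raised because something outside the CPU (here, the device or interconnect) failed to respond to the access. DFSC 0x06 is listed in the ARM Architecture Reference Manual as "Translation fault, level 2", which the MMU raises itself when the walk finds no valid entry. The "Synchronous External abort, on translation table walk, level 2" encoding is 0x16, and it would indicate that fetching a table entry from memory failed.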
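For anyone decoding these by hand in an exception handler, here is a minimal sketch in C that splits the DFSC field into broad classes. It covers only a subset of the encodings in the DFSC table of the ARM Architecture Reference Manual, and the type and function names (`dfsc_class`, `decode_dfsc`, `esr_is_data_abort`) are hypothetical, not from my handler. In this table 0x06 falls in the translation-fault group, while the external-abort-on-walk group is 0x14 to 0x17.

```c
#include <stdint.h>

/* Broad classes of Data Fault Status Code, ESR_ELx.ISS[5:0] (subset only). */
enum dfsc_class {
    DFSC_TRANSLATION,     /* 0b0001LL: no valid entry at level LL */
    DFSC_ACCESS_FLAG,     /* 0b0010LL */
    DFSC_PERMISSION,      /* 0b0011LL */
    DFSC_SEA_NOT_ON_WALK, /* 0b010000: external abort on the access itself */
    DFSC_SEA_ON_WALK,     /* 0b0101LL: external abort fetching a table entry */
    DFSC_OTHER            /* anything this sketch does not decode */
};

struct dfsc_info {
    enum dfsc_class cls;
    int level;            /* lookup level, or -1 when not applicable */
};

/* EC (ESR_ELx[31:26]) is 0x24 for a Data Abort from a lower EL, 0x25 from the same EL. */
static int esr_is_data_abort(uint64_t esr)
{
    unsigned ec = (unsigned)((esr >> 26) & 0x3f);
    return ec == 0x24 || ec == 0x25;
}

static struct dfsc_info decode_dfsc(uint64_t esr)
{
    unsigned dfsc = (unsigned)(esr & 0x3f);
    struct dfsc_info info = { DFSC_OTHER, -1 };

    switch (dfsc >> 2) {
    case 0x1: info.cls = DFSC_TRANSLATION; info.level = (int)(dfsc & 3); break;
    case 0x2: info.cls = DFSC_ACCESS_FLAG; info.level = (int)(dfsc & 3); break;
    case 0x3: info.cls = DFSC_PERMISSION;  info.level = (int)(dfsc & 3); break;
    case 0x4: if (dfsc == 0x10) info.cls = DFSC_SEA_NOT_ON_WALK;         break;
    case 0x5: info.cls = DFSC_SEA_ON_WALK; info.level = (int)(dfsc & 3); break;
    default:  break;
    }
    return info;
}
```

In a handler, one would first check `esr_is_data_abort(esr)` and then branch on `decode_dfsc(esr).cls`: a translation fault points at the tables, while an external abort not on a walk points at the bus or device behind an address that translated successfully.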