arm benchmarking cpu-architecture cortex-m cortex-a

ARM Cortex-M7 MPU shareability impact on M7 performance


I am running a system test case in which the MPU regions for QSPI, SRAM, DRAM, and device (peripheral) memory are all marked shareable in ARM_MPU_RASR. The test case performs a cacheable SRAM-to-SRAM copy. With this configuration, M7 performance is much lower, about 70 MB/s. When shareability is disabled for everything except device memory, throughput rises substantially to about 600 MB/s. Can someone please explain the reason behind this behavior? What is the difference between the CM7's MPU shareable attribute and the CA53's MMU shareable attribute?


Solution

  • According to the ARM Cortex-M7 Processor Technical Reference Manual (TRM):

    By default, only Normal, Non-shareable memory regions can be cached in the RAMs. Caching only takes place if the appropriate cache is enabled and the memory type is cacheable. Shared cacheable memory regions can be cached if CACR.SIWT is set to 1.

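    So in this case, because the SRAM region is marked shareable and CACR.SIWT is presumably still at its default of 0, the region is most likely being treated as non-cacheable even though its attributes request caching. That would explain the much lower throughput.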
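The rule quoted from the TRM can be sketched as a small host-side model. This is only an illustration, not code from the question or from Arm: the helper names `rasr_inner_cacheable` and `region_uses_l1_dcache` are hypothetical, the bit positions follow the ARMv7-M MPU_RASR layout and bit 0 of CACR for SIWT, and the model assumes the data cache itself is already enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* MPU_RASR fields (ARMv7-M layout). */
#define RASR_ENABLE     (1u << 0)
#define RASR_B          (1u << 16)
#define RASR_C          (1u << 17)
#define RASR_S          (1u << 18)   /* shareable */
#define RASR_TEX_SHIFT  19
#define RASR_TEX_MASK   (7u << RASR_TEX_SHIFT)

/* CACR.SIWT, "shared cacheable is write-through" (bit 0 of CACR). */
#define CACR_SIWT       (1u << 0)

/* Hypothetical helper: true if TEX/C/B describe Normal, inner-cacheable memory. */
static bool rasr_inner_cacheable(uint32_t rasr)
{
    uint32_t tex = (rasr & RASR_TEX_MASK) >> RASR_TEX_SHIFT;
    bool c = (rasr & RASR_C) != 0;
    bool b = (rasr & RASR_B) != 0;

    if (tex & 4u)  return c || b;  /* TEX=1BB: C,B give the inner policy, 00 = non-cacheable */
    if (tex == 0u) return c;       /* C=1: write-through or write-back; C=0: strongly-ordered/device */
    if (tex == 1u) return c && b;  /* write-back, write-allocate */
    return false;                  /* TEX=010: device or reserved */
}

/* Hypothetical model of the quoted rule, assuming the D-cache is enabled:
 * a shareable region is only cached when CACR.SIWT is set. */
static bool region_uses_l1_dcache(uint32_t rasr, uint32_t cacr)
{
    if (!(rasr & RASR_ENABLE) || !rasr_inner_cacheable(rasr)) {
        return false;
    }
    if ((rasr & RASR_S) && !(cacr & CACR_SIWT)) {
        return false;              /* shareable + SIWT=0: treated as non-cacheable */
    }
    return true;
}
```

On the target, the two ways out that follow from this rule would be to clear the S bit in the SRAM region's RASR value (as the faster configuration in the question does), or to set SIWT, for example with CMSIS `SCB->CACR |= SCB_CACR_SIWT_Msk;`. Note that with SIWT set, shared cacheable regions are handled as write-through, so the copy may still not reach the non-shareable write-back figure.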