linux caching dma armv8

Invalidate range by virtual address in dcache_inval_poc(start,end); ARMV8; Cache;


I'm confused by the implementation of dcache_inval_poc(start, end): https://github.com/torvalds/linux/blob/v5.15/arch/arm64/mm/cache.S#L134. There is no sanity check on the "end" address, so what happens if the range (start, end) passed down from an upper layer, such as dma_sync_single_for_cpu/dma_sync_single_for_device, is larger than the L1 data cache? For example: dcache_inval_poc(start, start + 256KB) when the L1 D-cache size is 32KB.

After going through the source code of dcache_inval_poc(start, end) https://github.com/torvalds/linux/blob/v5.15/arch/arm64/mm/cache.S#L152 , I tried to convert the loop to C-like pseudocode as follows:

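x0_kaddr = start & ~(cache_line_size - 1);   /* round down to a line boundary */

while (x0_kaddr < end) {                     /* loop on the moving address */

    dc_ivac(x0_kaddr);                       /* invalidate the line holding x0_kaddr; */
                                             /* a partially covered first or last line gets DC CIVAC instead */

    x0_kaddr += cache_line_size;

}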

If "end - start" is larger than the L1 D-cache size, the loop still runs, even though most of the "x0_kaddr" addresses cannot be present in the D-cache.


Solution

  • Your confusion comes from thinking of the cache lines as somehow mapped on top of one particular memory range. The function invalidates a range by virtual address, that is, in terms of the mapped memory and not in terms of the cache's capacity.

    As long as the start and end parameters are valid virtual addresses of normal memory, that is fine.


    The memory range does not have to be cached as a whole: only some of the data in the given range might be cached, or none at all.

    Say there is a 2MB buffer in physical DDR memory that is mapped and can be accessed through virtual addresses.
    Say L1 is 32KB.
    Then at most 32KB of the 2MB buffer can be in L1 at any moment (possibly none of it), and you don't know which part, if any, is in the cache.
    For that reason you loop over the virtual addresses of the whole 2MB buffer. If the cache_line_size block at a given address is in the cache, that cache line is invalidated. If the data is not in the cache and lives only in DDR memory, the operation is basically a nop.
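The point above can be sketched with a toy model in C. This is only an illustration under stated assumptions, not the kernel's code: the direct-mapped 32KB cache, the 64-byte line size and every name (toy_cache, toy_touch, toy_inval_line, toy_inval_range, toy_count_valid) are hypothetical. It shows that walking a range much larger than the cache is harmless, because addresses that are not cached simply miss.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64u    /* assumed cache line size in bytes */
#define NUM_LINES 512u   /* 512 * 64 B = 32 KB toy "L1" (assumption) */

/* Toy direct-mapped cache: each slot remembers which line address it holds. */
struct toy_cache {
    bool      valid[NUM_LINES];
    uintptr_t tag[NUM_LINES];   /* line-aligned address held in the slot */
};

/* Pretend the CPU accessed addr, so the line containing it becomes cached. */
static void toy_touch(struct toy_cache *c, uintptr_t addr)
{
    uintptr_t line = addr & ~(uintptr_t)(LINE_SIZE - 1);
    size_t slot = (size_t)((line / LINE_SIZE) % NUM_LINES);

    c->valid[slot] = true;
    c->tag[slot] = line;
}

/* Model of "invalidate by address": drops the line only if it is cached.
 * Returns true when a line was actually invalidated, false on a miss. */
static bool toy_inval_line(struct toy_cache *c, uintptr_t addr)
{
    uintptr_t line = addr & ~(uintptr_t)(LINE_SIZE - 1);
    size_t slot = (size_t)((line / LINE_SIZE) % NUM_LINES);

    if (c->valid[slot] && c->tag[slot] == line) {
        c->valid[slot] = false;
        return true;
    }
    return false;   /* not cached: nothing to do */
}

/* Walk [start, end) one line at a time, whatever the size of the range.
 * Returns how many lines were really present and got invalidated. */
static size_t toy_inval_range(struct toy_cache *c, uintptr_t start, uintptr_t end)
{
    size_t hits = 0;

    assert(start <= end);
    for (uintptr_t a = start & ~(uintptr_t)(LINE_SIZE - 1); a < end; a += LINE_SIZE)
        hits += toy_inval_line(c, a) ? 1u : 0u;
    return hits;
}

/* Count the lines that are still valid in the toy cache. */
static size_t toy_count_valid(const struct toy_cache *c)
{
    size_t n = 0;

    for (size_t i = 0; i < NUM_LINES; i++)
        n += c->valid[i] ? 1u : 0u;
    return n;
}
```

In this model, touching three addresses inside a 2MB buffer and then calling toy_inval_range over the whole 2MB visits 32768 line addresses but invalidates only the three lines that were present; every other iteration is a miss.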

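    It's good practice to pass start and end addresses aligned to cache_line_size. Cache maintenance by address always acts on a whole line and ignores the low bits of the address, so a loop that steps by cache_line_size from an unaligned start can miss the line that holds the tail of the buffer.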
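The alignment remark can be checked with a little arithmetic. The sketch below is an assumption-laden illustration rather than kernel code: the helper names lines_visited_naive and lines_touched are hypothetical, and the line size is passed in as a parameter. It compares how many lines a naive loop visits when it starts from an unaligned address with how many lines the buffer really overlaps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lines visited by a loop that steps by line_size from start as given,
 * without first rounding start down to a line boundary. */
static size_t lines_visited_naive(uintptr_t start, uintptr_t end, uintptr_t line_size)
{
    size_t n = 0;

    assert(line_size != 0);
    for (uintptr_t a = start; a < end; a += line_size)
        n++;
    return n;
}

/* Lines that the byte range [start, end) actually overlaps.
 * line_size must be a power of two. */
static size_t lines_touched(uintptr_t start, uintptr_t end, uintptr_t line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    if (start >= end)
        return 0;

    uintptr_t first = start & ~(line_size - 1);       /* line holding the first byte */
    uintptr_t last  = (end - 1) & ~(line_size - 1);   /* line holding the last byte */

    return (size_t)((last - first) / line_size) + 1;
}
```

With 64-byte lines, the range 0x1010..0x1090 overlaps three lines (0x1000, 0x1040, 0x1080), but the naive loop only issues operations for 0x1010 and 0x1050, so the line at 0x1080 holding the buffer tail is never visited. With the aligned range 0x1000..0x10c0 both counts agree.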

    PS: if you want to operate directly on cache lines, there are other operations for that. They take set and way parameters to address cache lines directly.