memory-management, tlb, memory-fragmentation

How can fragmented physical memory cause TLB thrashing?


On https://en.wikipedia.org/wiki/Fragmentation_(computing) I read this passage: "However, if the working set is fragmented, then it will not fit into 64 pages, and execution will slow due to thrashing: pages will be repeatedly added and removed from the TLB during operation."

I am confused about why fragmented physical memory would cause the working set to "not fit into 64 pages". I understand that even if the physical pages are fragmented, there should still be 64 physical and virtual pages occupying 64 TLB entries. In that scenario, shouldn't TLB hits and misses be unrelated to external fragmentation? Could someone clarify this for me?


Solution

  • I think this is not talking about physical memory fragmentation, but rather about cache vs. TLB behaviour. In that article the example is: "program has a working set of 256 KiB, and is running on a computer with a 256 KiB cache" and 64 TLB entries.

    If we assume those 256 KiB are all neatly packed into 64 contiguous 4 KiB virtual pages, then they are going to be backed by 64 4 KiB physical pages; all the mappings are going to fit into the TLB (well, maybe they will; that depends on the TLB architecture), and all the data will also fit in the cache.

    If we assume that some fragmentation has occurred and the used data is now distributed in 4 KiB chunks with 4 KiB holes between them, we still get only 64 actively used virtual and physical pages. The overall picture doesn't change much from the first example: the entire data set still fits into the TLB and the cache.

    If we now assume that fragmentation has occurred and the used data is evenly distributed in 1 KiB chunks with unused 1 KiB holes between them, the data size is still 256 KiB and it still fits into the cache (since each cache line is usually 64 bytes wide), but the data is now spread among 128 virtual and physical pages, and with only 64 TLB entries heavy TLB thrashing is inevitable (see the sketch at the end of this answer for the page-count arithmetic).

    This part of the article is somewhat misleading because of the various possible definitions of working set. I think their working definition is "amount of frequently accessed data, in bytes" - under this definition the article makes sense. However, the working set is usually measured at page granularity, as the number of virtual pages allocated to the process (indeed, even the Wiki article linked from this article says "typically the units of information in question are considered to be memory pages"). Under that definition a working set of 256 KiB will inevitably use exactly 64 4 KiB pages and the whole example falls apart.
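
    To make the page-count arithmetic above concrete, here is a small C sketch (not part of the quoted article; the 256 KiB working set, 4 KiB pages, and 64-entry TLB come from the example, while the page-aligned start of the region is an assumption made for illustration). It counts how many distinct pages the used data touches under the three layouts discussed above.

    ```c
    #include <stdio.h>

    #define PAGE_SIZE   4096u
    #define DATA_SIZE   (256u * 1024u)   /* 256 KiB of actually used data     */
    #define TLB_ENTRIES 64u              /* TLB size from the article example */

    /* Count the distinct pages touched when DATA_SIZE bytes of used data are
     * laid out as chunk-byte runs separated by hole-byte gaps, starting at
     * offset 0 of a page-aligned region (an assumption for illustration). */
    static unsigned pages_touched(unsigned chunk, unsigned hole)
    {
        unsigned char touched[1024] = {0};   /* covers up to 4 MiB of span */
        unsigned long long addr = 0, used = 0;
        unsigned pages = 0;

        while (used < DATA_SIZE) {
            /* Walk the used bytes of this chunk and mark their pages. */
            for (unsigned i = 0; i < chunk && used < DATA_SIZE; i++, used++) {
                unsigned long long page = (addr + i) / PAGE_SIZE;
                if (!touched[page]) {
                    touched[page] = 1;
                    pages++;
                }
            }
            addr += chunk + hole;            /* skip over the unused hole */
        }
        return pages;
    }

    int main(void)
    {
        struct { const char *desc; unsigned chunk, hole; } layouts[] = {
            { "contiguous 256 KiB",          256u * 1024u, 0u    },
            { "4 KiB chunks, 4 KiB holes",   4096u,        4096u },
            { "1 KiB chunks, 1 KiB holes",   1024u,        1024u },
        };

        for (unsigned i = 0; i < 3; i++) {
            unsigned pages = pages_touched(layouts[i].chunk, layouts[i].hole);
            printf("%-26s -> %3u pages touched (%s %u TLB entries)\n",
                   layouts[i].desc, pages,
                   pages <= TLB_ENTRIES ? "fits in" : "exceeds", TLB_ENTRIES);
        }
        return 0;
    }
    ```

    The first two layouts touch exactly 64 pages, so every translation can stay resident in the 64-entry TLB; the 1 KiB-chunk layout touches 128 pages, twice the TLB capacity, which is the thrashing situation the quoted passage describes.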