Tags: caching, memory-management, operating-system, dirty-data

Why do we need to write a dirty page back to disk to evict it?


"...if a page has been modified and is thus dirty, it must be written back to disk to evict it, which is expensive." (In chapter 22 of OSTEP)

I don't understand why. To evict a dirty page from memory, it is moved to swap space, and later it is moved back. Is it necessary to also write it to the disk again? That would mean two disk I/Os every time we evict a dirty page.


Solution

  • I think you're conflating two separate things. Swap space (a region of disk reserved by the OS) acts as the backing store for anonymous pages, i.e. pages that have no backing file. The statement you quoted is probably referring to a file-backed dirty page. Such a page came from a file on disk, so it never needs to go to swap space; it can just be written back to its location in that file. Either way, a dirty page must be written somewhere on disk before its frame is reused, or the new data would be lost.

    If file-backed pages were evicted to swap space, as your post implies, you'd be right that it would waste disk I/O: the dirty page would be written back to its file and then written again to swap. But file-backed pages are not evicted to swap space, so evicting a dirty page costs only one write.


    Swap space makes it easy to treat file-backed and anonymous pages similarly: both kinds of pages can be evicted to disk, except that anonymous pages go to swap whereas file-backed pages go back to their usual location on disk.

    Furthermore, clean pages never need to be written to disk when evicted, because their contents can be recovered without the in-memory copy. A clean file-backed page already exists on disk in its current state. This holds for clean anonymous pages too: those are just virtually allocated pages that all map to the same shared zeroed page, so there is nothing to swap out. Once such a page is written to, the write triggers a copy-on-write (COW) page fault; the page gets its own physical frame, is marked dirty, and from then on must be written to swap if it is evicted.
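
To make the cases above concrete, here is a minimal sketch of the eviction decision as a toy model in Python. It is not how any real kernel is written; the names `Backing`, `Page`, and `evict` are hypothetical and exist only for illustration. The point it tries to show is that evicting a page costs at most one disk write, and the only thing that changes is where that write goes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Backing(Enum):
    """Hypothetical label for what backs a page on disk."""
    FILE = "file"        # page contents came from a file on disk
    ANONYMOUS = "anon"   # heap/stack page with no backing file


@dataclass
class Page:
    """Toy page descriptor (illustrative only, not a kernel structure)."""
    backing: Backing
    dirty: bool = False


def evict(page: Page) -> Tuple[int, str]:
    """Return (disk writes needed, write destination) for evicting one page."""
    if not page.dirty:
        # Clean page: its contents can be recovered later (re-read from the
        # file, or re-mapped to the shared zero page), so the frame is dropped.
        return 0, "none"
    if page.backing is Backing.FILE:
        # Dirty file-backed page: one write, straight back to its file.
        return 1, "backing file"
    # Dirty anonymous page: one write, to swap space.
    return 1, "swap space"


if __name__ == "__main__":
    for backing in Backing:
        for dirty in (False, True):
            writes, destination = evict(Page(backing, dirty))
            print(f"{backing.value:>4} dirty={dirty!s:<5} "
                  f"writes={writes} destination={destination}")
```

In this model no combination of backing type and dirty state ever produces two writes, which is the situation the question was worried about.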