windows filesystems ntfs memory-mapped-io

Detect unclean filesystem shutdown


I have a project where we manipulate large amounts of cached data using memory-mapped files. We use Windows 10, NTFS and .NET.

When the user starts the application, we detect whether the previous program session was shut down correctly, and if so we reuse the cache.
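The question does not say how this check is implemented. One common approach is a marker file that exists only while a session is running, sketched below in Python for brevity. The class name `SessionMarker` and the marker file are hypothetical, not part of the project described here.

```python
import os


class SessionMarker:
    """Hypothetical helper: the marker file exists only while a session runs."""

    def __init__(self, path):
        self.path = path

    def start(self):
        """Return True if the previous session ended cleanly (or never ran)."""
        # A surviving marker means the previous session never reached stop().
        previous_was_clean = not os.path.exists(self.path)
        with open(self.path, "w") as f:
            f.write("running")
            f.flush()
            os.fsync(f.fileno())  # push the marker to disk before doing any work
        return previous_was_clean

    def stop(self):
        """Call on orderly shutdown, after the cache has been written out."""
        os.remove(self.path)
```

A session stopped from the debugger never calls `stop()`, so the next `start()` returns False and the cache is rebuilt, which is exactly the annoyance described next.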

However, this is a pain for developers when debugging. It's quite common to just stop the program being debugged. At next startup, the cached data needs to be recalculated, which takes time and is annoying.

So, we've been thinking we could introduce a 'transaction log', so that we can recover even if the previous shutdown was unclean.

Now for the actual problem.

There seem to be no guarantees about the order in which the pages of a memory-mapped file are flushed. If the program is simply stopped, there is no problem, since the operating system will flush the entire memory-mapped file to disk. The problem arises if there is a power cut. In that case, there are no guarantees about what state the file is in. Our "transaction log" doesn't help either, unless we always flush the transaction log to disk before modifying the cache. That would defeat the purpose of our architecture, since it would introduce unacceptable performance penalties.
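To make the ordering constraint concrete, here is a minimal sketch in Python of the "flush the log before touching the cache" rule. The record layout and the names `apply_update` and `replay` are assumptions for illustration only. In .NET the corresponding calls would be along the lines of `FileStream.Flush(true)` for the log and a `MemoryMappedViewAccessor` for the cache.

```python
import mmap
import os
import struct

# Hypothetical record layout: 8-byte offset, 4-byte length, then the payload.
HEADER = struct.Struct("<QI")


def apply_update(log_file, cache_map: mmap.mmap, offset: int, data: bytes) -> None:
    """Log the change durably first, then modify the mapped cache."""
    log_file.write(HEADER.pack(offset, len(data)) + data)
    log_file.flush()
    os.fsync(log_file.fileno())  # this per-update flush is the expensive part
    # Only now is it safe for the OS to write the cache page at any time.
    cache_map[offset:offset + len(data)] = data


def replay(log_path: str, cache_map: mmap.mmap) -> int:
    """Reapply every complete log record; return how many were applied."""
    applied = 0
    with open(log_path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break  # end of log, or a record cut off mid-header
            offset, length = HEADER.unpack(header)
            data = f.read(length)
            if len(data) < length:
                break  # payload cut off by the crash; ignore it
            cache_map[offset:offset + length] = data
            applied += 1
    return applied
```

The sketch has no checksums, so it only detects records that are cut short, not ones that are corrupted in place. It also shows why the cost is a concern: every call to `apply_update` waits for the disk.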

If we could somehow know that our memory-mapped file on disk was left in a state where the OS didn't manage to flush all pages before the operating system shut down, we could just throw the entire file away at the next startup. There would be a delay, but it would be totally acceptable, since it would only occur after a power cut or similar event.

When the operating system boots, it knows that the file is possibly corrupt, because it knows the filesystem was not cleanly unmounted.

And finally, my question:

Is there some way to ask Windows if the file system was clean when it was mounted?


Solution

  • NTFS commits its own log only periodically, so there is a window in which a power failure could occur and NTFS would still (correctly) report the volume as clean. "Clean" here refers to NTFS's own data, not to user data.

    You will likely have to do what databases do, which is to lock your cache into physical memory so that you control when it is written to disk.
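As a rough illustration of controlling the writes yourself, the Python sketch below keeps the cache in an ordinary in-process buffer and writes it out only at explicit checkpoints, using a temporary file followed by a rename. It does not show locking memory (on Windows that would involve an API such as `VirtualLock`), and the names `checkpoint` and `load` are hypothetical. Real databases combine this kind of checkpoint with a log rather than rewriting the whole file.

```python
import os


def checkpoint(cache: bytearray, path: str) -> None:
    """Write a complete copy of the cache, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(cache)
        f.flush()
        os.fsync(f.fileno())  # the new copy is on disk before it becomes visible
    # Replace the old checkpoint in one step, so a reader sees either the old
    # file or the new one, not a mixture of both.
    os.replace(tmp_path, path)


def load(path: str):
    """Return the last completed checkpoint, or None if there is none."""
    try:
        with open(path, "rb") as f:
            return bytearray(f.read())
    except FileNotFoundError:
        return None
```

With this arrangement nothing reaches the cache file between checkpoints, so after a power cut the file holds the last completed checkpoint. The trade-off is that changes made since that checkpoint are lost, and rewriting a large cache in full may be too slow without an incremental scheme.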