windows filesystems ntfs

How to pre-allocate a file on Windows (NTFS) without writing the whole file


I have an app that needs to preallocate space for a potentially very large file on disk (a few TB). The file should occupy space on disk, so that the allocated space can't be used by anything else.

Using either SetFileInformationByHandle or SetFilePointerEx with SetEndOfFile, as suggested in https://stackoverflow.com/a/25119897/3806795, does allocate the space, but there is a big issue: when I write something near the end of the allocated file, Windows seems to start writing the whole file to disk (I assume it fills the gap between the start of the file and the actual write with zeroes).

This results in the "System" process using 100% of the disk's I/O until the write is finished. With a multi-terabyte file that takes a while even on an SSD, and it wears down the SSD's write endurance, especially when the file is close to the size of the drive itself.

Is there a way to allocate a non-sparse file on Windows 10/11 (NTFS) that doesn't have such a problem? I don't experience such issues with fallocate on Linux.


Solution

  • In the original scenario the file is fully allocated, but its VDL (valid data length) is zero. VDL is the high-water mark of valid data in the file. When a write lands beyond VDL, NTFS must zero all space between the current VDL and the start of the new data. This is a security requirement: it keeps the former contents of the disk from being exposed to a new user.

    That is why the I/O starts in the original scenario: to advance VDL to the position that was written, NTFS has to zero everything in between.

    You can work around this with the SetFileValidData API (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfilevaliddata). Beware that the caller must hold the SE_MANAGE_VOLUME_NAME privilege, which in practice means an elevated admin application, and that this is considered a security issue because the previous disk contents are exposed.

    As mentioned, using a sparse file is another option.

    The system is behaving by design.
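To make the zero-fill cost concrete, here is a small toy model of the VDL rule described above. It is only a sketch of the bookkeeping, not how NTFS is implemented: the class `ToyFile` and its methods `write` and `set_valid_data` are hypothetical names invented for this illustration, and `set_valid_data` merely imitates the effect of SetFileValidData on the high-water mark without calling any Windows API.

```python
from dataclasses import dataclass


@dataclass
class ToyFile:
    """Toy model of a preallocated file: allocation size plus VDL."""

    allocated: int        # bytes reserved on disk for the file
    vdl: int = 0          # valid data length: high-water mark of valid data
    zero_filled: int = 0  # running total of bytes the model had to zero

    def write(self, offset: int, length: int) -> int:
        """Model a write and return the bytes that must be zeroed first."""
        if offset < 0 or length < 0 or offset + length > self.allocated:
            raise ValueError("write falls outside the allocated range")
        if length == 0:
            return 0  # nothing written, so VDL does not move
        gap = max(0, offset - self.vdl)  # region between old VDL and new data
        self.zero_filled += gap
        self.vdl = max(self.vdl, offset + length)
        return gap

    def set_valid_data(self, length: int) -> None:
        """Hypothetical stand-in for the effect of SetFileValidData."""
        if not self.vdl <= length <= self.allocated:
            raise ValueError("length must lie between VDL and allocated size")
        self.vdl = length  # no zeroing: old disk contents become readable


if __name__ == "__main__":
    TIB = 1 << 40
    BLOCK = 4096

    plain = ToyFile(allocated=2 * TIB)
    print("zeroed, VDL left at 0:", plain.write(2 * TIB - BLOCK, BLOCK))

    raised = ToyFile(allocated=2 * TIB)
    raised.set_valid_data(2 * TIB)
    print("zeroed, VDL raised first:", raised.write(2 * TIB - BLOCK, BLOCK))
```

In this model, a single 4 KiB write at the tail of a 2 TiB file with VDL at zero accounts for 2 TiB minus 4 KiB of zero fill, which matches the sustained "System" I/O seen in the question. Raising VDL first brings that to zero, at the price of exposing whatever was on disk before.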