windows powershell filesystems corruption refs

How to programmatically simulate file corruption to test ReFS Health Check and Recovery features?


I would like to programmatically test Windows ReFS Health Check and Recovery features.

Note: on its own, ReFS only detects bit rot; it does not self-heal. For ReFS to both detect and automatically repair corruption, it must be used together with Storage Spaces. So I have prepared a Storage Spaces pool with a two-way mirror, mounted as S:\.

ReFS integrity streams have been enabled with:

PS C:\> Set-FileIntegrity -FileName 'S:\' -Enable $True

as per instructions found here.

How can I programmatically simulate file corruption to test ReFS Health Check and Recovery features?

I can't find an easy way to introduce bit rot. Every method I have tried only makes changes that ReFS accepts as legitimate writes.

A PowerShell method would be best, if possible. Perl, Python, or any other language would be fine too.

Thank you in advance.


Solution

  • To create corruption, use the destructive write test within Hard Disk Sentinel Pro. Set it to work randomly rather than sequentially. I set it to write random patterns of bits. Just run it for one to three minutes, and you’ll see on the displayed map a whole bunch of spots all over the drive getting destroyed.

    Here’s how I did some testing (I’m typing fast, so I hope I don’t leave something out):

    1. Nearly fill an ReFS mirrored storage space with files.

    2. Enable file integrity for all the files:

      Get-ChildItem -Path 'i:*' -Recurse | Set-FileIntegrity -Enable $True -Enforce $False

    A later test uses -Enforce $True, but run with -Enforce $False first; the reason becomes clear below. Read up on what the -Enable and -Enforce parameters do.

    3. Remove one of the drives and attach it to a SATA port on a second computer.
    4. On that second computer, introduce file corruption with Hard Disk Sentinel.
    5. Remove the corrupted drive and put it back in the first computer with the storage space. You will now have a mirrored storage space where one drive is okay and one has a bunch of corrupted files.

    Try a mass copy of all the files from the storage space over to some other drive.

    My tests show that almost nothing gets repaired and almost nothing shows up in the event log. Maybe one or two errors, and that’s it. You might think that perhaps not much was corrupted in the first place. Well, now set -Enforce $True and do the copy operation again. With Enforce on, the copy stops at dozens of files with checksum errors, which proves that ReFS is checking checksums in that case.

    The problem is that, again, almost nothing shows up in the log. Also, with Enforce on, I got a checksum error on the one file that had supposedly been fixed during the first test with Enforce off!

    Check these threads:

    Why Use ReFS?

    ReFS test with corrupt data. Does it work

    Has anyone run the Data Integrity Scan on an ReFS volume?

    ReFS/Storage Spaces does log a problem from time to time and people see that so they just assume it works great. Also, folks can't find a good way to create test corruption so they don't bother testing. I tested on the Windows 10 Pro for Workstations SKU and the results are terrible.

    Please run some tests yourself to confirm my findings.
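
For anyone who wants to script the corruption step instead of using Hard Disk Sentinel, here is a minimal Python sketch of the same idea: overwrite randomly chosen sectors with random bytes. It is only an illustration under some assumptions. The function name `corrupt_random_sectors`, the 4096-byte sector size, and the `seed` parameter are my own choices, not part of any tool mentioned above. It assumes the target is a disk image or the raw device of the mirror member that has been taken out of the pool; pointing it at a file on the mounted ReFS volume would just produce ordinary writes that ReFS accepts as legitimate. I have only reasoned about it against an image file, and a raw device may need its size passed in explicitly, plus administrator rights.

```python
import os
import random

SECTOR_SIZE = 4096  # assumed sector size; adjust to match the actual device


def corrupt_random_sectors(target, sector_count, seed=None,
                           sector_size=SECTOR_SIZE, size=None):
    """Overwrite randomly chosen sectors of `target` with random bytes.

    `target` is assumed to be a disk image or raw device that is NOT
    currently part of a mounted volume. Returns the sorted list of byte
    offsets that were overwritten, so the damage can be logged.
    """
    rng = random.Random(seed)  # a fixed seed makes the damage reproducible
    with open(target, "r+b", buffering=0) as dev:
        # Seeking to the end reports the size of a regular file; for a raw
        # device this may not work, so `size` can be supplied by the caller.
        if size is None:
            size = dev.seek(0, os.SEEK_END)
        total_sectors = size // sector_size
        if total_sectors == 0:
            raise ValueError("target is smaller than one sector")

        count = min(sector_count, total_sectors)
        # Pick distinct sectors so each one is damaged exactly once.
        sectors = rng.sample(range(total_sectors), count)
        offsets = sorted(s * sector_size for s in sectors)

        for offset in offsets:
            dev.seek(offset)
            garbage = bytes(rng.getrandbits(8) for _ in range(sector_size))
            dev.write(garbage)
    return offsets
```

A call such as `corrupt_random_sectors("disk1.img", 200, seed=42)` (where `disk1.img` is a hypothetical image path) would damage 200 sectors and return their offsets. Keeping the returned list makes it easier to judge afterwards how much corruption ReFS actually noticed.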