Tags: encryption, cryptography, aes, large-files, on-the-fly

Encrypting and/or decrypting large files (AES) on a memory- and storage-constrained system, with "catastrophe recovery"


I have a fairly generic question, so please pardon if it is a bit vague.

So, let's assume a 1 GB file that needs to be encrypted and later decrypted on a given system.

The problem is that the system has less than 512 MB of free memory and about 1.5 GB of storage space (give or take). With the file "onboard", that leaves about 500 MB of "hard drive scratch space" and less than 512 MB of RAM to "play with".

The system may well experience an "unscheduled power down" at any moment during encryption or decryption, and it needs to be able to resume the encryption/decryption process successfully after being powered up again (this seems like an extra-unpleasant nut to crack).

The questions are:

1) Is it doable at all? :)

2) What would be the best strategy for

a) encrypting/decrypting with so little scratch space (can't have the entire file lying around while decrypting/encrypting, need to truncate it "on the fly" somehow...)

and

b) implementing a disaster recovery that would work in such a constrained environment?

P.S.: The cipher used has to be AES.

I looked into AES-CTR specifically but it does not seem to bode all that well for the disaster recovery shenanigan in an environment where you can't keep the entire decrypted file around till the end...

[edited to add] I think I'll be doing it the Iserni way after all.


Solution

  • It is doable, provided you have a means to save the AES state (for example the current IV or counter block) together with the file position.

    1. Save the AES state and file position P to both STAGE1 and STAGE2 (two redundant copies)
    2. Read one chunk (say, 10 megabytes) of encrypted/decrypted data
    3. Write the decrypted/encrypted chunk to external scratch SCRATCH
    4. Log the fact that SCRATCH is completed
    5. Write SCRATCH over the original file at the same position
    6. Log the fact that SCRATCH has been successfully copied
    7. Goto 1

    If you get a hard crash after stage 1, and STAGE1 and STAGE2 disagree, you just restart and assume the stage with the earliest P to be good.

    If you get a hard crash during or after stage 2, you lose 10 megabytes worth of work: but the AES state and P are good, so you just repeat stage 2.

    If you crash at stage 3, then on recovery you won't find the marker of stage 4, and so will know that SCRATCH is unreliable and must be regenerated. Having STAGE1/STAGE2, you are able to do so.

    If you crash at stage 4, you will BELIEVE that SCRATCH must be regenerated, even if you could avoid this, but you lose nothing in regenerating except a little time.

    By the same token, if you crash during 5, or before 6 is committed to disk, you just repeat stages 5 and 6. You know you don't have to regenerate SCRATCH because stage 4 was committed to disk. If you crash after stage 1, you will still have a good SCRATCH to copy.
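The steps and recovery rules above could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the answer: the names `process_file`, `durable_write`, `load_json`, the `MARKER` file and its JSON layout are all hypothetical, and `transform_chunk` is a placeholder keystream XOR, NOT AES (the standard library has no AES; a real version would call an AES implementation and save its state next to P).

```python
import hashlib
import json
import os

CHUNK = 10 * 1024 * 1024  # chunk size from the text; must be a multiple of 32 here


def durable_write(path, data):
    """Write to a temp file, force it to disk, then atomically replace path.
    (A stricter version would also fsync the containing directory.)"""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def transform_chunk(key, pos, data):
    """PLACEHOLDER, NOT AES: XOR with a SHA-256 keystream derived from (key, offset)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + (pos + i).to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], pad))
    return bytes(out)


def load_json(path):
    try:
        with open(path, "rb") as f:
            return json.load(f)
    except (OSError, ValueError):
        return None  # missing or unreadable copy


def process_file(path, workdir, key, chunk=CHUNK):
    """Transform the file in place, chunk by chunk; safe to call again after a crash."""
    stages = [os.path.join(workdir, n) for n in ("STAGE1", "STAGE2")]
    scratch = os.path.join(workdir, "SCRATCH")
    marker_path = os.path.join(workdir, "MARKER")
    size = os.path.getsize(path)

    # Recovery: trust the earliest P found in STAGE1/STAGE2.
    saved = [s["pos"] for s in map(load_json, stages) if s]
    pos = min(saved) if saved else 0
    marker = load_json(marker_path) or {}
    state = marker.get("state") if marker.get("pos") == pos else None

    while pos < size:
        if state != "copied":
            if state != "scratch_done":
                for s in stages:                                   # step 1
                    durable_write(s, json.dumps({"pos": pos}).encode())
                with open(path, "rb") as f:                        # step 2
                    f.seek(pos)
                    data = f.read(chunk)
                durable_write(scratch, transform_chunk(key, pos, data))  # step 3
                durable_write(marker_path, json.dumps(             # step 4
                    {"pos": pos, "state": "scratch_done"}).encode())
            with open(scratch, "rb") as f:
                data = f.read()
            with open(path, "r+b") as f:                           # step 5
                f.seek(pos)
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            durable_write(marker_path, json.dumps(                 # step 6
                {"pos": pos, "state": "copied"}).encode())
        pos += chunk                                               # step 7
        state = None
```

Calling `process_file` again with the same work directory resumes from the last committed step; once the whole file is done, the caller is expected to delete the work directory before starting a new pass (for example the later decryption). Because the marker records whether a chunk was already copied, a chunk is never transformed twice, which would otherwise undo it.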

    All this assumes that 10 MB is more than the write caches (the OS cache, plus the hard disk's own cache if it uses write-back) can hold. If it is not, raise the chunk size to 32 or 64 MB. Recovery will be proportionately slower.

    It might help to flush() and sync(), if these functions are available, after every write-stage has been completed.
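As an illustration of what flushing and syncing after a write stage could look like in Python: the file object's `flush()` pushes Python's own buffer to the operating system, and `os.fsync()` asks the OS to push that file's data to the storage device. The helper name `write_at_durably` is hypothetical, and a disk with its own write-back cache may still hold the data briefly after `os.fsync()` returns.

```python
import os


def write_at_durably(path, offset, data):
    """Overwrite len(data) bytes of an existing file at offset, then force them out."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()             # Python's userspace buffer -> OS cache
        os.fsync(f.fileno())  # OS cache -> storage device, as far as the OS can tell
```

On Unix, `os.sync()` is also available and flushes everything system-wide, which is heavier than syncing the one file that was just written.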

    Total write time is a bit more than twice normal, because every chunk has to be written twice (once to SCRATCH, once over the original) in order to be sure.