
Using md5sum for speeding up dd disk imaging, sample script: Good idea?


I was thinking of ways to keep my laptop HDD backed up safely while still being able to put the backup into use quickly if needed. My plan is the following: I would buy a 2.5" HDD of the same size with a USB-to-SATA cable and clone the internal drive to it. When disaster strikes, I would just swap the HDD in the laptop for the backup and be good to go again. However, I would like to avoid writing all 500 GB each time I back up the drive, especially since I know a fair part of it (+/- 80 GB) is rarely written to. This is where the following md5sum/dd script comes to the rescue, I hope:

#!/bin/bash
block="1M"
end=50000
count=10

input="/dev/sda"

# choose ONE output: a block device or an image file
output="/dev/sdb"
#output="/path/to/imagefile"


function md5compute()
{
    # hash $count blocks of size $block, starting at block offset $2 of $1
    dd if="$1" skip="$2" bs=$block count=$count 2>/dev/null | md5sum | awk '{ print $1 }'
}
for ((i=0; i<=end; i++))
do
    start=$(($i*$count))
    md5source=$(md5compute $input $start)
    md5destination=$(md5compute $output $start)
    if [ "$md5source" != "$md5destination" ]
    then
        dd if=$input of=$output bs=$block skip=$start seek=$start count=$count conv=sync,noerror,notrunc
    fi
done
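The core idea of the script (hash a block on each side, copy only when the hashes differ) can be tried out safely on ordinary files before pointing it at real devices. This is a minimal sketch under assumed conditions: the files here are throwaway temporary files standing in for `/dev/sda` and `/dev/sdb`, and a single 4-byte "block" stands in for the 1M blocks:

```shell
# Demo of the hash-and-copy-if-different idea on ordinary files.
# src/dst are temporary files standing in for the real devices.
src=$(mktemp); dst=$(mktemp)
printf 'AAAA' > "$src"
printf 'BBBB' > "$dst"

# hash the same 4-byte "block" of source and destination
h_src=$(dd if="$src" bs=4 count=1 2>/dev/null | md5sum | awk '{print $1}')
h_dst=$(dd if="$dst" bs=4 count=1 2>/dev/null | md5sum | awk '{print $1}')

# copy the block only when the hashes differ (notrunc keeps the file size)
if [ "$h_src" != "$h_dst" ]; then
    dd if="$src" of="$dst" bs=4 count=1 conv=notrunc 2>/dev/null
fi

cmp -s "$src" "$dst" && result="blocks match"
echo "$result"
rm -f "$src" "$dst"
```

After the conditional `dd`, the two files compare equal and the script prints `blocks match`.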

Now, the question part:

A) By running this, would I miss some part of the disk? Do you see any flaws?

B) Could I save some time compared to the full 500 GB read/write?

C) Obviously I potentially write less to the target disk. Will I improve the lifetime of that disk?

D) I was thinking of leaving count at 1 and increasing the block size. Is this a good idea or a bad idea?

E) Would this same script work with an image file as output?

Not being very fluent in programming, I'm sure there is plenty of room for improvement. Any tips?

Thank you all...


Solution

  • Point by point answer:

    1. By running this, would I miss some part of the disk?
    2. Do you see some flaws?
    3. Could I save some time compared to the 500 GB read/write?
    4. Obviously I potentially write less to the target disk. Will I improve the lifetime of that disk?
    5. I was thinking of leaving count at 1 and increasing the block size. Is this a good idea or a bad idea?
    6. Would this same script work with an image file as output?

    Functionality answer.

    For jobs like this, you may use rsync! With this tool you can copy only the parts that have changed.
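    A minimal sketch of the rsync approach on ordinary image files (the files here are throwaway temporary files created just for the demo): `--inplace` rewrites the destination file in place, `--no-whole-file` enables rsync's delta-transfer algorithm even for local copies, and `-I` forces a comparison even when size and timestamp already match:

    ```shell
    # Sync a changed "image" to a backup copy, rewriting it in place.
    # src/dst are temp files standing in for real image files.
    src=$(mktemp); dst=$(mktemp)
    printf 'hello world' > "$src"
    printf 'hello xorld' > "$dst"

    # -I: skip the size/mtime quick check; --inplace: update in place;
    # --no-whole-file: use delta transfer even locally
    rsync -I --inplace --no-whole-file "$src" "$dst"

    cmp -s "$src" "$dst" && result="in sync"
    echo "$result"
    rm -f "$src" "$dst"
    ```

    Note that rsync still has to read both sides to find the differences; the saving is in writes, not reads.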

    Using ssh, dd and sha1sum

    Here is the kind of command I run sometimes:

    ssh $USER@$SOURCE "dd if=$SRCPATH/$SRCDEV | tee >(sha1sum >/dev/stderr); sleep 1" |
        tee >(sha1sum >/dev/tty) | dd of=$LOCALPATH/$LOCALDEV

    This will do a full read on the source host, then a sha1sum before sending to localhost (the destination), then a second sha1sum to verify the transfer before writing to the local device.

    This may render something like:

    2998920+0 records in
    2998920+0 records out
    1535447040 bytes (1.4 GiB) copied, 81.42039 s, 18.3 MB/s
    d61c645ab2c561eb10eb31f12fbd6a7e6f42bf11  -
    d61c645ab2c561eb10eb31f12fbd6a7e6f42bf11  -
    2998920+0 records in
    2998920+0 records out
    1535447040 bytes (1.4 GiB) copied, 81.42039 s, 18.3 MB/s
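    The tee-with-checksum trick can also be tried locally without ssh. A minimal sketch on temporary files (all names here are made up for the demo): `tee` writes the stream to the destination file while `sha1sum` hashes the same bytes in flight, and the written file is then hashed again to confirm the copy:

    ```shell
    # Copy a stream to a file while checksumming it in flight,
    # then verify the written file against the in-flight checksum.
    src=$(mktemp); dst=$(mktemp)
    printf 'payload' > "$src"

    # tee writes the stream to $dst while sha1sum hashes it
    h_stream=$(dd if="$src" 2>/dev/null | tee "$dst" | sha1sum | awk '{print $1}')
    h_file=$(sha1sum "$dst" | awk '{print $1}')

    [ "$h_stream" = "$h_file" ] && result="transfer verified"
    echo "$result"
    rm -f "$src" "$dst"
    ```

    If the two hashes ever differ, the copy was corrupted in transit and should be retried.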