I am now doing some tests of my application again corrupted files. But I found it is hard to find test files.
So I'm wondering whether there are some existing tools, which can write random/garbage bytes into a file of some format.
Basically, I need this tool to:
Thanks.
The /dev/urandom
pseudo-device, along with dd
, can do this for you:
dd if=/dev/urandom of=newfile bs=1M count=10
This will create a file newfile
of size 10M.
The /dev/random
device will often block if there is not sufficient randomness built up, urandom
will not block. If you're using the randomness for crypto-grade stuff, you can steer clear of urandom
. For anything else, it should be sufficient and most likely faster.
If you want to corrupt just bits of your file (not the whole file), you can simply use the C-style random functions. Just use rnd()
to figure out an offset and length n
, then use it n
times to grab random bytes to overwrite your file with.
The following Perl script shows how this can be done (without having to worry about compiling C code):
use strict;
use warnings;
sub corrupt ($$$$) {
# Get parameters, names should be self-explanatory.
my $filespec = shift;
my $mincount = shift;
my $maxcount = shift;
my $charset = shift;
# Work out position and size of corruption.
my @fstat = stat ($filespec);
my $size = $fstat[7];
my $count = $mincount + int (rand ($maxcount + 1 - $mincount));
my $pos = 0;
if ($count >= $size) {
$count = $size;
} else {
$pos = int (rand ($size - $count));
}
# Output for debugging purposes.
my $last = $pos + $count - 1;
print "'$filespec', $size bytes, corrupting $pos through $last\n";
# Open file, seek to position, corrupt and close.
open (my $fh, "+<$filespec") || die "Can't open $filespec: $!";
seek ($fh, $pos, 0);
while ($count-- > 0) {
my $newval = substr ($charset, int (rand (length ($charset) + 1)), 1);
print $fh $newval;
}
close ($fh);
}
# Test harness.
system ("echo =========="); #DEBUG
system ("cp base-testfile testfile"); #DEBUG
system ("cat testfile"); #DEBUG
system ("echo =========="); #DEBUG
corrupt ("testfile", 8, 16, "ABCDEFGHIJKLMNOPQRSTUVWXYZ ");
system ("echo =========="); #DEBUG
system ("cat testfile"); #DEBUG
system ("echo =========="); #DEBUG
It consists of the corrupt
function that you call with a file name, minimum and maximum corruption size and a character set to draw the corruption from. The bit at the bottom is just unit testing code. Below is some sample output where you can see that a section of the file has been corrupted:
==========
this is a file with nothing in it except for lowercase
letters (and spaces and punctuation and newlines).
that will make it easy to detect corruptions from the
test program since the character range there is from
uppercase a through z.
i have to make it big enough so that the random stuff
will work nicely, which is why i am waffling on a bit.
==========
'testfile', 344 bytes, corrupting 122 through 135
==========
this is a file with nothing in it except for lowercase
letters (and spaces and punctuation and newlines).
that will make iFHCGZF VJ GZDYct corruptions from the
test program since the character range there is from
uppercase a through z.
i have to make it big enough so that the random stuff
will work nicely, which is why i am waffling on a bit.
==========
It's tested at a basic level but you may find there are edge error cases which need to be taken care of. Do with it what you will.