This downloads an image to disk:
require "open-uri"
image = URI.open(permalink_url, "rb", &:read) # Kernel#open no longer fetches URLs on Ruby 3.0+
...
File.binwrite "images/#{hash}", image
Sometimes the saved file comes out corrupted, even though no exception was raised.
UPD: The ImageMagick documentation says identify "reports if an image is incomplete or corrupt", but it does not:
$ identify temp.png
temp.png PNG 1080x1080 1080x1080+0+0 8-bit sRGB 2.126MB 0.000u 0:00.049
Here are two corrupted images:
UPD: I redownloaded the image and did some analysis. The bad version has 300000 extra bytes in the middle of the file, scattered in many separate pieces. The garbage is not just 0x00; it looks random.
Use any of the image-handling gems, e.g. chunky_png:
require 'chunky_png'
begin
  ChunkyPNG::Datastream.from_file('bad.png')
rescue ChunkyPNG::CRCMismatch
  puts "png corrupted!"
end
Edit: Datastream is more efficient than Image in this case.
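For reference, the kind of check that a CRCMismatch error points at can be sketched without any gem, using only Zlib from the standard library. This is a rough illustration, not chunky_png's actual implementation: the method name png_crc_ok? and the constant PNG_SIGNATURE are made up for this example. It walks the PNG chunk list and compares each stored CRC with one computed over the chunk type and data. It does not decode pixel data, so it only catches damage that breaks a CRC or the chunk framing.

```ruby
require 'zlib'

# The fixed 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = "\x89PNG\r\n\x1a\n".b

# Hypothetical helper: returns true if every chunk up to and including IEND
# has a stored CRC that matches the CRC computed over its type and data.
def png_crc_ok?(bytes)
  bytes = bytes.b
  return false unless bytes.start_with?(PNG_SIGNATURE)

  pos = PNG_SIGNATURE.bytesize
  loop do
    # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    header = bytes.byteslice(pos, 8)
    return false if header.nil? || header.bytesize < 8

    length, type = header.unpack('Na4')
    body = bytes.byteslice(pos + 8, length + 4)
    return false if body.nil? || body.bytesize < length + 4 # truncated chunk

    data = body.byteslice(0, length)
    stored_crc = body.byteslice(length, 4).unpack1('N')
    return false unless Zlib.crc32(type + data) == stored_crc

    return true if type == 'IEND'
    pos += 12 + length
  end
end
```

It could be called as png_crc_ok?(File.binread('bad.png')), or on the downloaded string before it is written to disk.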
Edit 2: If you want to be able to validate any format that ImageMagick can handle and don't mind calling external binaries, this should work:
unless system('identify', '-verbose', 'bad.jpg', out: IO::NULL, err: IO::NULL)
  puts "the file can't be opened or is corrupted"
end
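One detail of the snippet above: Kernel#system returns true on a zero exit status, false on a non-zero one, and nil when the command could not be run at all (for example, when ImageMagick is not installed). The unless treats the last two the same way. A minimal sketch that keeps them apart, assuming a hypothetical helper name check_with_tool and made-up result symbols:

```ruby
# Hypothetical helper: runs an external checker on a file and reports
# one of three states instead of a plain true/false.
def check_with_tool(path, tool: 'identify', args: ['-verbose'])
  result = system(tool, *args, path, out: IO::NULL, err: IO::NULL)
  case result
  when true  then :ok                    # tool exited with status 0
  when false then :corrupt_or_unreadable # tool exited with a non-zero status
  else            :tool_missing          # nil: the command could not be executed
  end
end
```

With this, check_with_tool('bad.jpg') returning :tool_missing signals a setup problem rather than a bad image.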