Memory-efficient computation of the md5sum of a file in vlang


The following code reads a file into a byte array and computes the md5sum of that array. It works, but I would like to find a solution in V that needs less RAM. Thanks for your comments!

import os
import crypto.md5

b := os.read_bytes("file.txt") or {panic(err)}

s := md5.sum(b).hex()

println(s)

I also tried the following, without success:

import os
import crypto.md5
import io

mut f := os.open_file("file.txt", "r")?

mut h := md5.new()

io.cp(mut f, mut h)?

s := h.sum().hex()

println(s) // does not return the correct md5sum

Solution

  • Alrighty. This is what you're looking for. It produces the same result as md5sum and is only slightly slower. block_size controls the trade-off between memory use and speed: decreasing block_size lowers the memory footprint, but the checksum takes longer to compute; increasing block_size has the opposite effect. I tested on a 2 GB Manjaro disc image and can confirm the memory usage is very low.

    Note: This seems to run noticeably slower without the -prod flag. The V compiler applies extra optimizations to production builds so that they run faster.

    import crypto.md5
    import io
    import os
    
    fn main() {
        println(hash_file('manjaro.img')?)
    }
    
    const block_size = 64 * 65535
    
    fn hash_file(path string) ?string {
        mut file := os.open(path)?
        defer {
            file.close()
        }
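        // The file is read into this reusable buffer one chunk at a time.
        mut buf := []u8{len: block_size}
        mut r := io.new_buffered_reader(reader: file)
        mut digest := md5.new()
        for {
            // read returns an error at end of file (or on failure), which ends the loop.
            x := r.read(mut buf) or { break }
            // Feed only the x bytes that were actually read into the digest.
            digest.write(buf[..x])?
        }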
        return digest.checksum().hex()
    }
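
To try out the block_size trade-off described above, here is a minimal sketch that passes the block size as an argument instead of using a constant. hash_file_with is a hypothetical name, not part of the answer or of V's standard library. The sketch uses only the calls that appear in hash_file and the same `?` error-propagation style; newer V compilers may expect `!` in place of `?`. The checksum should be the same for every block size, so only memory use and run time should change.

```v
import crypto.md5
import io
import os

// hash_file_with is a hypothetical variant of hash_file that takes the
// block size (in bytes) as a parameter, so different sizes can be compared.
fn hash_file_with(path string, block_size int) ?string {
	mut file := os.open(path)?
	defer {
		file.close()
	}
	// One reusable chunk buffer of block_size bytes.
	mut buf := []u8{len: block_size}
	mut r := io.new_buffered_reader(reader: file)
	mut digest := md5.new()
	for {
		// An error (end of file or a read failure) ends the loop.
		n := r.read(mut buf) or { break }
		// Hash only the n bytes that were actually read.
		digest.write(buf[..n])?
	}
	return digest.checksum().hex()
}

fn main() {
	// Hash the same file with three block sizes; the printed sums should match.
	for size in [4096, 65536, 4194304] {
		sum := hash_file_with('manjaro.img', size)?
		println('${size} bytes per block: ${sum}')
	}
}
```

Building with the -prod flag and timing each run is one way to pick a block size for a given machine.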