go hash stream suffix

Go: calculate a hash from an io.Reader and append it to the end of the stream (MultiReader)


I have a reader `in` representing some file data that I want to process. I also want to calculate a hash of that data and append it to the end of the stream.

The challenge is that the hash can only be calculated once `in` has been fully consumed.

My first idea was to use io.MultiReader:

// HASH GENERATION - This should be called after `in` is consumed
preHasher, err := hash.NewMultiHasherTypes(hash.NewHashSet(hash.MD5))
if err != nil {
    return nil, err
}
in = io.TeeReader(in, preHasher)

hash := preHasher.Sums()[hash.MD5] 
byteHash, err := hex.DecodeString(hash)
if err != nil {
    return nil, err
}
// HASH GENERATION - END

// Create reader from hash
hashEndReader := io.LimitReader(bytes.NewReader(byteHash), 16)

// Append hash to the end
newInput := io.MultiReader(in, hashEndReader)

The problem is that the hash must be generated after the `in` reader has been fully consumed. In the example above, Sums() is called before anything has been read from `in`.

Any idea what construct could be used? Ideally hashEndReader would be created lazily, so that the hash generation function is called only once `in` has been consumed.


Solution

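  • Wrap the reader in a type that hashes the data as it is read and, at EOF, switches to a reader over the hash bytes (this needs the bytes, hash, and io packages):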

    type hashReader struct {
        // If h != nil, then r is the original reader,
        // else r is a reader on the hash bytes.
        r io.Reader
        // h is set to nil when the original reader is read
        // to EOF.
        h hash.Hash
    }
    
    func (hr *hashReader) Read(b []byte) (int, error) {
        n, err := hr.r.Read(b)
        // Hash the bytes when reading the main stream.
        if hr.h != nil {
            hr.h.Write(b[:n]) // hash.Hash.Write never returns an error
            // If we reached the end of the main stream, swap in
            // a reader on the hash bytes and suppress the EOF so
            // that the caller keeps reading.
            if err == io.EOF {
                hr.r = bytes.NewReader(hr.h.Sum(nil))
                hr.h = nil
                err = nil
            }
        }
        return n, err
    }
    

    Example use:

    hr := &hashReader{
        r: bytes.NewReader(input),
        h: md5.New(),
    }
    

    Test case on the playground.