
Java InputStream: copy to and calculate hash at the same time


Here are my two code snippets:

public class Uploader {

  private static final String SHA_256 = "SHA-256";

  public String getFileSHA2Checksum(InputStream fis) throws IOException {
    try {
      // The digest is SHA-256, so the variable is named accordingly (it was md5Digest)
      MessageDigest sha256Digest = MessageDigest.getInstance(SHA_256);
      return getFileChecksum(sha256Digest, fis);
    } catch (NoSuchAlgorithmException e) {
      return "KO";
    }
  }

  public void transferTo(InputStream fis) throws IOException {
    FileUtils.copyInputStreamToFile(fis, file2);
  }

My code uses this class as:

Is it possible to copy the stream to a file and calculate the checksum at the same time, in a single pass while the InputStream is open?


Solution

  • You can use the DigestInputStream to calculate a hash while reading from a stream. That is, you wrap the original input stream with a DigestInputStream and read through the DigestInputStream. While reading the data, the message digest is automatically updated, and you can retrieve the digest after you read the entire stream.

    Alternatively, you can use DigestOutputStream to calculate a hash while writing to a stream. In a similar vein, you wrap the destination output stream with a DigestOutputStream and write through the DigestOutputStream.

    A quick and dirty example:

    var inputFile = Path.of("D:\\Development\\data\\testdata-csv\\customers-1000.csv");
    var outputFile = Files.createTempFile("testoutput", ".dat");
    var md = MessageDigest.getInstance("SHA-256");
    try (var in = new DigestInputStream(Files.newInputStream(inputFile), md);
         var out = Files.newOutputStream(outputFile)) {
        in.transferTo(out);
    } finally {
        Files.deleteIfExists(outputFile);
    }
    
    System.out.println(HexFormat.of().formatHex(md.digest()));
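
The DigestOutputStream variant described above can be sketched in a similar way. This is a minimal illustration, not a drop-in replacement: the class name `DigestOutputExample` and the method `copyAndHash` are hypothetical, and the sketch assumes Java 17 or later (for `HexFormat`) and that the caller owns and closes both streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper class for illustration; not part of the original Uploader.
class DigestOutputExample {

    // Copies all bytes from in to out and returns the SHA-256 of the
    // bytes written, as a lowercase hex string.
    static String copyAndHash(InputStream in, OutputStream out) throws IOException {
        try {
            var md = MessageDigest.getInstance("SHA-256");
            // Every write that passes through digestOut also updates md.
            // The wrapper is deliberately not closed here, because the
            // caller is assumed to own (and close) the underlying streams.
            var digestOut = new DigestOutputStream(out, md);
            in.transferTo(digestOut);
            digestOut.flush();
            return HexFormat.of().formatHex(md.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("Expected SHA-256 to be supported", e);
        }
    }

    public static void main(String[] args) throws IOException {
        var source = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        var sink = new ByteArrayOutputStream();
        System.out.println(copyAndHash(source, sink));
    }
}
```

Wrapping the output side is convenient when the input stream is handed to you by other code and you only control the destination.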
    

    In terms of your existing code, you could do something like:

    public String transferAndHash(InputStream in) throws IOException {
        try {
            var md = MessageDigest.getInstance("SHA-256");
            try (var digestIn = new DigestInputStream(in, md)) {
                transferTo(digestIn);
            }
            return HexFormat.of().formatHex(md.digest());
        } catch (NoSuchAlgorithmException e) {
            // all recent Java versions are required to support SHA-256
            throw new AssertionError("Expected SHA-256 to be supported", e);
        }
    }
    

    (NOTE: HexFormat was introduced in Java 17. If you're using an earlier version, you'll need an alternative way to convert the digest bytes to a hex string.)
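
For Java versions before 17, one possible alternative is a small hand-written converter. The sketch below is only an illustration: the class name `HexFallback` and the method `toHex` are hypothetical, and it produces lowercase hex, which is assumed to match the default output of `HexFormat.of().formatHex(...)`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; uses only APIs available before Java 17.
class HexFallback {

    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Converts each byte to two lowercase hex characters.
    static String toHex(byte[] bytes) {
        char[] chars = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;                  // treat the byte as unsigned
            chars[2 * i] = HEX_DIGITS[v >>> 4];       // high nibble
            chars[2 * i + 1] = HEX_DIGITS[v & 0x0F];  // low nibble
        }
        return new String(chars);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(toHex(digest));
    }
}
```

With this helper, `HexFormat.of().formatHex(md.digest())` would become `HexFallback.toHex(md.digest())`.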