Tags: java, hash, bittorrent, torrent, info-hash

Unable to create a torrent's info hash


I'm having trouble finding the issue with how I'm generating the corresponding info hash for a torrent file. This is the code I have so far:

InputStream input = null;
try {
    MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
    input = new FileInputStream(file);
    StringBuilder builder = new StringBuilder();
    while (!builder.toString().endsWith("4:info")) {
       builder.append((char) input.read()); // It's ASCII anyway.
    }
    ByteArrayOutputStream output = new ByteArrayOutputStream();
    for (int data; (data = input.read()) > -1; output.write(data));
    sha1.update(output.toByteArray(), 0, output.size() - 1);
    this.infoHash = sha1.digest();
    System.out.println(new String(Hex.encodeHex(infoHash)));
} catch (NoSuchAlgorithmException | IOException e) {
     e.printStackTrace();
} finally {
    if (input != null) try { input.close(); } catch (IOException ignore) {}
}

Below is my expected and actual hash:

Expected: d4d44272ee5f5bf887a9c85ad09ae957bc55f89d
Actual: 4d753474429d817b80ff9e0c441ca660ec5d2450

The torrent I'm trying to generate an info hash for can be found here (Ubuntu 14.04 Desktop amd64).

Let me know if I can provide any more info, thanks!


Solution

  • Exceptions contain 4 useful bits of info: Type, Message, Trace, and Cause. You're tossing away 3 out of the 4 relevant bits of info. Also, code is part of a process, and when an error occurs, generally that process cannot be finished at all. And yet on exceptions your process continues. Stop doing this; you've written code that only hurts you. Remove the try and the catch, and add a throws clause to your method signature. If you can't, the go-to default (and update your IDE's template if it generated this code) is throw new RuntimeException("Unhandled", e);. This is shorter, does not destroy any of the 4 interesting bits of info, and ends the process.

    Separately, silently ignoring an IOException thrown by the input stream's close method is also wrong. It is highly unlikely to throw, but if it does, you should assume you didn't read every byte. Since that would be one explanation for a mismatched hash, ignoring it is misguided.

    Finally, use the proper language constructs: There is a try-with-resources statement that would work far better here.

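    You're calling update with output.size() - 1, which lops off the last byte read. That is only right if info is the last key in the file, so that the dropped byte is the closing e of the outer dictionary. The info hash covers exactly the bencoded info dictionary, from its opening d to its matching e; if any other key follows info, this range is wrong.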

    Reading bytes into a builder, and then for every byte converting the builder to a string and checking its suffix, is incredibly inefficient; even for a file as small as 1MB that'll cause quite a grind. The loop also never ends if the marker is missing, because read() keeps returning -1 at the end of the file.

    Reading a single byte at a time from a raw FileInputStream is similarly inefficient, because every read will cause file access (reading 1 byte is as expensive as reading a whole buffer full, so, it's about 50000 times slower than it needs to be).

    Here's how to do this with somewhat newer API, and look how much nicer this code reads. It also acts better under erroneous conditions:

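    // Needs: java.io.IOException, java.nio.charset.StandardCharsets,
    // java.nio.file.Files, java.nio.file.Paths, java.security.MessageDigest
    byte[] data = Files.readAllBytes(Paths.get(fileName));
    var search = "4:info".getBytes(StandardCharsets.US_ASCII);
    int searchIdx = -1; // index of the first byte after the marker
    for (int i = 0; searchIdx == -1 && i <= data.length - search.length; i++) {
        for (int j = 0; j < search.length; j++) {
            if (data[i + j] != search[j]) break;
            if (j == search.length - 1) searchIdx = i + search.length;
        }
    }
    if (searchIdx == -1) throw new IOException("Input torrent file does not contain marker");

    // Hash the info value only. This assumes info is the last key in the file,
    // so the final byte is the outer dictionary's closing 'e' and is left out.
    var sha1 = MessageDigest.getInstance("SHA-1");
    sha1.update(data, searchIdx, data.length - searchIdx - 1);
    byte[] hash = sha1.digest();
    StringBuilder hex = new StringBuilder();
    for (byte h : hash) hex.append(String.format("%02x", h));
    System.out.println(hex);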
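
Searching for the bytes 4:info is fragile: the same bytes can appear inside an earlier string value, and the search says nothing about where the info dictionary ends. A sturdier option is to walk the bencode structure of the top-level dictionary and hash exactly the span of the info value. The following is a minimal sketch under that assumption; InfoHashSketch, skipValue, infoSpan and infoHashHex are hypothetical names made up for this example, not part of any library, and error handling is kept deliberately thin.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of any BitTorrent library.
class InfoHashSketch {

    // Index of the first occurrence of target at or after from.
    static int indexOf(byte[] d, char target, int from) throws IOException {
        for (int i = from; i < d.length; i++) {
            if (d[i] == target) return i;
        }
        throw new IOException("Truncated bencode data");
    }

    // Returns the index just past the bencoded value that starts at pos.
    static int skipValue(byte[] d, int pos) throws IOException {
        if (pos >= d.length) throw new IOException("Truncated bencode data");
        byte c = d[pos];
        if (c == 'i') {                              // integer: i<digits>e
            return indexOf(d, 'e', pos) + 1;
        }
        if (c == 'l' || c == 'd') {                  // list or dictionary: l...e, d...e
            pos++;
            while (true) {
                if (pos >= d.length) throw new IOException("Truncated bencode data");
                if (d[pos] == 'e') return pos + 1;
                pos = skipValue(d, pos);             // dictionary keys are strings, so this covers both
            }
        }
        if (c >= '0' && c <= '9') {                  // byte string: <length>:<bytes>
            int colon = indexOf(d, ':', pos);
            int len = Integer.parseInt(new String(d, pos, colon - pos, StandardCharsets.US_ASCII));
            if (len > d.length - colon - 1) throw new IOException("Truncated bencode data");
            return colon + 1 + len;
        }
        throw new IOException("Unexpected byte at offset " + pos);
    }

    // Returns {start, end} of the info value in the top-level dictionary (end is exclusive).
    static int[] infoSpan(byte[] d) throws IOException {
        if (d.length == 0 || d[0] != 'd') throw new IOException("Not a bencoded dictionary");
        int pos = 1;
        while (pos < d.length && d[pos] != 'e') {
            if (d[pos] < '0' || d[pos] > '9') throw new IOException("Dictionary key is not a string");
            int colon = indexOf(d, ':', pos);
            int keyEnd = skipValue(d, pos);
            String key = new String(d, colon + 1, keyEnd - colon - 1, StandardCharsets.US_ASCII);
            int valueEnd = skipValue(d, keyEnd);
            if (key.equals("info")) return new int[] {keyEnd, valueEnd};
            pos = valueEnd;
        }
        throw new IOException("No info key in top-level dictionary");
    }

    // SHA-1 over the raw bytes of the info value, as lowercase hex.
    static String infoHashHex(byte[] torrent) throws IOException, NoSuchAlgorithmException {
        int[] span = infoSpan(torrent);
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(torrent, span[0], span[1] - span[0]);
        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest()) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: InfoHashSketch <file.torrent>");
            return;
        }
        System.out.println(infoHashHex(Files.readAllBytes(Paths.get(args[0]))));
    }
}
```

Because the span comes from the structure itself, this does not depend on info being the last key, and a stray 4:info inside a comment or tracker URL is skipped over as part of that string.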