I am not sure whether I am using the murmur3 function from OpenHFT's zero-allocation-hashing correctly, but hashChars() and hashBytes() return different results for the same string:
// Using zero-allocation-hashing 0.16
String input = "abc123";
System.out.println(LongHashFunction.murmur_3().hashChars(input));
System.out.println(LongHashFunction.murmur_3().hashBytes(input.getBytes(StandardCharsets.UTF_8)));
Output:
-4878457159164508227
-7432123028918728600
The latter matches the output of the Guava library.
Which function should be used for String inputs?
Shouldn't both functions produce the same result?
Update:
How can I get the same output as:
Hashing.murmur3_128().newHasher().putString(input, Charsets.UTF_8).hash().asLong();
Hashing.murmur3_128().newHasher().putString(input, Charsets.UTF_8).hash().toString()
using the zero-allocation-hashing library, which seems to be faster than Guava?
Your assumption regarding UTF-8 is not correct. hashChars() hashes the string's 16-bit char values, so it agrees with hashBytes() for StandardCharsets.UTF_16LE, not for UTF-8.
String input = "abc123";
System.out.println(LongHashFunction.murmur_3().hashChars(input));
System.out.println(LongHashFunction.murmur_3().hashBytes(
    input.getBytes(StandardCharsets.UTF_16LE)
));
gives:
-4878457159164508227
-4878457159164508227
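The difference comes down to which bytes get hashed. A Java String holds 16-bit chars, so for ASCII text its UTF-16LE encoding has two bytes per character while its UTF-8 encoding has one. Here is a minimal sketch using only the JDK (the class name EncodingBytesDemo and the hex helper are made up for illustration, they are not part of either library):

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytesDemo {
    // Renders bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "abc123";
        // UTF-8: one byte per ASCII character.
        System.out.println(hex(input.getBytes(StandardCharsets.UTF_8)));
        // prints: 61 62 63 31 32 33
        // UTF-16LE: two bytes per char, low byte first.
        System.out.println(hex(input.getBytes(StandardCharsets.UTF_16LE)));
        // prints: 61 00 62 00 63 00 31 00 32 00 33 00
    }
}
```

Since the two byte sequences differ, any byte-oriented hash is expected to give different values for them.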
To reproduce the desired Guava value:
Hashing.murmur3_128().newHasher().putString(input, Charsets.UTF_8).hash().asLong();
this:
LongHashFunction.murmur_3().hashBytes(input.getBytes(StandardCharsets.UTF_8));
seems to work (please test it with more inputs!)
The (hex) string conversion is more of a problem, since the Guava hash really is 128 bits (16 bytes, 2 longs), whereas LongHashFunction from "your lib" gives us only 64 bits!
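To illustrate the 16 bytes versus 2 longs relationship: as far as I know, Guava's HashCode.asLong() reads the first eight bytes in little-endian order, so it corresponds to the first of the two longs. A JDK-only sketch of that layout follows (HashHalvesDemo and its methods are hypothetical names, and the two constants are arbitrary values, not real hash output):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class HashHalvesDemo {
    // Packs two longs into 16 bytes, each long least significant byte first.
    static byte[] toBytes(long first, long second) {
        return ByteBuffer.allocate(2 * Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(first)
                .putLong(second)
                .array();
    }

    // Reads the first eight bytes back as a little-endian long.
    static long firstLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        // Arbitrary example values, not real hash output.
        byte[] bytes = toBytes(0x1122334455667788L, 0x99aabbccddeeff00L);
        System.out.println(bytes.length);                                  // prints: 16
        System.out.println(firstLong(bytes) == 0x1122334455667788L);       // prints: true
    }
}
```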
I can reproduce half of the digits with:
...
thx to:
With your help (sorry, this is my first encounter with this library), I could finally reproduce the full hex string:
System.out.println("Actual: " +
    toHexString(
        LongTupleHashFunction.murmur_3().hashBytes(
            input.getBytes(StandardCharsets.UTF_8)
        )
    )
);
where:
// Formats each long as 16 hex digits, least significant byte first.
private static String toHexString(long[] hashLongs) {
    StringBuilder sb = new StringBuilder(hashLongs.length * Long.BYTES * 2);
    for (long lng : hashLongs) {
        for (int i = 0; i < Long.BYTES; i++) {
            // Shift by whole bytes (i * 8 bits) to extract byte i.
            byte b = (byte) (lng >> (i * Byte.SIZE));
            sb.append(HEX_DIGITS[(b >> 4) & 0xf]).append(HEX_DIGITS[b & 0xf]);
        }
    }
    return sb.toString();
}

private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();
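To sanity-check the byte order of the helper without either hashing library, it can be fed known values. The sketch below wraps the same conversion in a hypothetical HexDemo class; the input longs are arbitrary and only chosen so the byte order is easy to read:

```java
public class HexDemo {
    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Formats each long as 16 hex digits, least significant byte first.
    static String toHexString(long[] hashLongs) {
        StringBuilder sb = new StringBuilder(hashLongs.length * Long.BYTES * 2);
        for (long lng : hashLongs) {
            for (int i = 0; i < Long.BYTES; i++) {
                byte b = (byte) (lng >> (i * Byte.SIZE));
                sb.append(HEX_DIGITS[(b >> 4) & 0xf]).append(HEX_DIGITS[b & 0xf]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Arbitrary values, not real hash output.
        System.out.println(toHexString(new long[] {0x0123456789abcdefL, 1L}));
        // prints: efcdab89674523010100000000000000
    }
}
```

Each long comes out with its lowest byte first, which is why the digits look reversed compared with Long.toHexString().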