java python go jedis murmurhash

MurmurHash implementations in different languages produce different results


I've tried MurmurHash in three languages: Java (Jedis and Guava), Go, and Python. The Java (Guava), Go, and Python versions all output the same hash code, but the Java (Jedis) version outputs a different one. All of the code is shown below. I'm confused by the result. I've seen this issue and tried Long.reverseBytes in Java, but the result still differs from the others. What should I do to make all the MurmurHash outputs match? Thanks~

1. Java version (Jedis)

Gradle dependency: compile group: 'redis.clients', name: 'jedis', version: '3.1.0'

import redis.clients.jedis.util.MurmurHash;

MurmurHash murmurhash = new MurmurHash();
long h = murmurhash.hash("foo");
System.out.println(h);
System.out.println(Long.reverseBytes(h));

output:

-7063922479176959649

6897758107479832477

2. Go version

import "github.com/spaolacci/murmur3"

foo := int64(murmur3.Sum64WithSeed([]byte("foo"), 0x1234ABCD))
fmt.Println(foo)

output:

-5851200325189400636

3. Python version

pip install mmh3

import mmh3

# mmh3.hash64 returns a tuple of two 64-bit integers; the first one is compared here
foo = mmh3.hash64('foo', seed=0x1234ABCD, signed=True)[0]
print(foo)

output:

-5851200325189400636

4. Java version (Guava)

Gradle dependency: compile group: 'com.google.guava', name: 'guava', version: '28.0-jre'

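import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;

// asLong() returns the first 64 bits of the 128-bit Murmur3 hash
long foo = Hashing.murmur3_128(0x1234ABCD).hashString("foo", StandardCharsets.UTF_8).asLong();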
System.out.println(foo);

output:

-5851200325189400636

Solution

  • TL;DR
    Jedis uses Murmur2 while the other libraries use Murmur3.


    I ran into the same problem while migrating some code from Java/Jedis to Go.

    The difference comes from different versions of the Murmur algorithm. To this day, Jedis uses Murmur2 (see its source code and documentation), while the other libraries mentioned above use Murmur3. These are different algorithms, so neither changing the seed nor reordering bytes with Long.reverseBytes will make their outputs agree.

    Besides reading the comments and code, I also verified this with the Murmur2 reference implementation: using the same seed and key produces exactly the same results as your Jedis example.

    Code Fragment:

    const char *key = "foo";
    
    uint64_t result = MurmurHash64A(key, std::strlen(key), 0x1234ABCD);
    
    std::cout << "  result (unsigned): " << result << std::endl;
    std::cout << "    result (signed): " << static_cast<int64_t>(result) << std::endl;  // fixed-width cast; long is only 32 bits on some platforms
    std::cout << "reversed byte order: " << __builtin_bswap64(result) << std::endl;
    

    Output:

        result (unsigned): 11382821594532591967
          result (signed): -7063922479176959649
      reversed byte order:  6897758107479832477
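
If the goal is to reproduce the Jedis value from another language, one option is to port Murmur2 there instead of trying to adjust Murmur3. Below is a minimal Python sketch of the 64-bit MurmurHash64A variant, written from the public reference algorithm. The names murmurhash64a and to_signed64 are made up for this illustration and are not part of mmh3 or Jedis. The sketch assumes little-endian block reads and the seed 0x1234ABCD used in the examples above.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # keep every intermediate value within 64 bits


def murmurhash64a(data: bytes, seed: int = 0x1234ABCD) -> int:
    """Illustrative port of MurmurHash64A (Murmur2, 64-bit); returns an unsigned value."""
    m = 0xC6A4A7935BD1E995
    r = 47
    length = len(data)
    h = (seed ^ (length * m)) & MASK64

    # Mix the input in 8-byte little-endian blocks.
    full = length - (length % 8)
    for i in range(0, full, 8):
        k = int.from_bytes(data[i:i + 8], "little")
        k = (k * m) & MASK64
        k ^= k >> r
        k = (k * m) & MASK64
        h ^= k
        h = (h * m) & MASK64

    # Mix the remaining 1 to 7 bytes, if any.
    tail = data[full:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * m) & MASK64

    # Final avalanche.
    h ^= h >> r
    h = (h * m) & MASK64
    h ^= h >> r
    return h


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed one, like a Java long."""
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    result = murmurhash64a("foo".encode("utf-8"))
    print(result)               # 11382821594532591967
    print(to_signed64(result))  # -7063922479176959649
```

The signed value matches the Jedis output from the question. Going the other way (making Jedis match the Murmur3 libraries) would mean not using the Jedis MurmurHash class at all, for example by using Guava's murmur3_128 on the Java side as in example 4.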