I tried this online murmur hash tool and it's outputting the correct hash for the user agent input. I have it set to Murmur Hash 3 x64 128 bit
and it's working well. I'm trying to replicate the behavior in Python. I tried the pymmh3
package but it's outputting the wrong hash, or I'm using it wrong.
How can I replicate the hash logic from that site?
import pymmh3
def testHash(input_string):
    hash_result = pymmh3.hash128(input_string, seed=0, x64arch=True)
    hex_hash = format(hash_result & ((1 << 128) - 1), '032x')
    print(hex_hash)
testHash("5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36")
# function outputs
# 4dc49759f17d090f97fca4dadead58f9
# Correct output
# 97fca4dadead58f94dc49759f17d090f
TLDR
You are using the wrong variant of the murmur3 hash function. Use mmh3.hash64(key, seed=0, x64arch=True, signed=False) and concatenate the two resulting integers, first element first.
The long story
I ran some experiments to find the problem in your code (it is not obvious at first glance). First of all, you should know that MurmurHash3 for x86 and for x64 does not produce the same values, so that was the first thing I checked.
import mmh3
value = "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
print(f"X32: {hex(mmh3.hash128(value, seed=0, x64arch=False))}")
print(f"X64: {hex(mmh3.hash128(value, seed=0, x64arch=True))}")
Unfortunately, neither result matches the expected output:
> python3 test.py
X32: 0x136c7e7981107207236f645a11137545
X64: 0x4dc49759f17d090f97fca4dadead58f9
Next I tried the other variant of the hash function, hash64:
import mmh3
value = "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
t = mmh3.hash64(value, seed=0, x64arch=False)
print(f"X32: {hex(t[0])}{hex(t[1])}")
t = mmh3.hash64(value, seed=0, x64arch=True)
print(f"X64: {hex(t[0])}{hex(t[1])}")
The results are still not right:
> python test_1.py
X32: 0x236f645a111375450x136c7e7981107207
X64: -0x68035b252152a7070x4dc49759f17d090f
But the minus sign in the last line gave me the idea to switch the hash result to unsigned. And voilà!
import mmh3
value = "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
t = mmh3.hash64(value, seed=0, x64arch=False, signed=False)
print(f"X32: {hex(t[0])}{hex(t[1])}")
t = mmh3.hash64(value, seed=0, x64arch=True, signed=False)
print(f"X64: {hex(t[0])}{hex(t[1])}")
One of the results now starts with the expected value:
> python3 test_1.py
X32: 0x236f645a111375450x136c7e7981107207
X64: 0x97fca4dadead58f90x4dc49759f17d090f
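The minus sign in the earlier output is only a matter of interpretation: the same 64 bits are read as a signed two's complement number. As a small sketch (the helper name to_unsigned64 is made up for illustration and is not part of mmh3), masking with 64 one-bits recovers the unsigned value:

```python
MASK64 = (1 << 64) - 1  # 64 one-bits


def to_unsigned64(value: int) -> int:
    """Reinterpret a signed 64-bit integer as unsigned (two's complement)."""
    return value & MASK64


# The signed first element printed by hash64(..., signed=True) above
signed_half = -0x68035B252152A707
print(hex(to_unsigned64(signed_half)))  # 0x97fca4dadead58f9
```

This is the same value that signed=False returns directly.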
So the JavaScript code on the site computes the x64 variant of murmur3 and prints the result as unsigned integers. The Python hash64 function returns a tuple of two 64-bit integers instead of a single 128-bit integer, so you need to concatenate their hex representations (without the 0x prefixes) to get the same string.
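Comparing the outputs above, the hash128 value holds the same two 64-bit halves as the site's string, just in the opposite order. Assuming that holds in general, a sketch like the following (site_style_hex is a hypothetical helper, not an mmh3 function) reorders the halves of a 128-bit result and pads each to 16 hex digits, which hex() would not do:

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def site_style_hex(h128: int) -> str:
    """Format a 128-bit hash as low 64 bits first, then high 64 bits.

    Assumption: the site prints the halves in this order, as the
    outputs in this thread suggest.
    """
    h128 &= MASK128       # normalise a possibly negative (signed) value
    low = h128 & MASK64   # same as the first element of hash64(..., signed=False)
    high = h128 >> 64     # same as the second element
    # 016x pads each half with leading zeros
    return f"{low:016x}{high:016x}"


# Value returned by hash128(value, seed=0, x64arch=True) in the question
print(site_style_hex(0x4DC49759F17D090F97FCA4DADEAD58F9))
# 97fca4dadead58f94dc49759f17d090f
```

With the hash64 tuple t from above, the equivalent is f"{t[0]:016x}{t[1]:016x}".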