cmurmurhash

Murmurhash2 Unsigned Int overflow


I'm currently trying to implement a hashtable/trie, but when I pass in parameters to murmurhash2, it gives back a number but I get run time errors of unsigned int overflow:

test.c:53:12: runtime error: unsigned integer overflow: 24930 * 1540483477 cannot be represented in type 'unsigned int'

test.c:60:4: runtime error: unsigned integer overflow: 2950274797 * 1540483477 cannot be represented in type 'unsigned int' 6265

I've put a bunch of stars(*) on the lines 53 and 60

I'm not sure if I'm passing some parameters wrong. Any help would be greatly appreciated!

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed );

int main(void)
{
   const char* s= "aa";
   unsigned int number= MurmurHash2 (s, (int)strlen(s), 1) % 10000;
   printf("%u\n", number);
}

unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
// 'm' and 'r' are mixing constants generated offline.
// They're not really 'magic', they just happen to work well.

const unsigned int m = 0x5bd1e995;
const int r = 24;

// Initialize the hash to a 'random' value

unsigned int h = seed ^ len;

// Mix 4 bytes at a time into the hash

const unsigned char * data = (const unsigned char *)key;

while(len >= 4)
{
    unsigned int k = *(unsigned int *)data;

    k *= m;
    k ^= k >> r;
    k *= m;

    h *= m;
    h ^= k;

    data += 4;
    len -= 4;
}

// Handle the last few bytes of the input array

switch(len)
{
case 3: h ^= data[2] << 16;
case 2: h ^= data[1] << 8;
case 1: h ^= data[0];
        h *= m; ************************************************
};

// Do a few final mixes of the hash to ensure the last few
// bytes are well-incorporated.

h ^= h >> 13;
h *= m;   **************************************
h ^= h >> 15;

return h;
}

Solution

  • It seems that you're building with the UBSan option -fsanitize=unsigned-integer-overflow or some other option like -fsanitize=integer that enables this check. The documentation says:

    Note that unlike signed integer overflow, unsigned integer is not undefined behavior. However, while it has well-defined semantics, it is often unintentional, so UBSan offers to catch it.

    In the case of MurmurHash, unsigned integer overflow in multiplications is fully intentional, so you should disable the option.

    Another note: Your code seems to be copied from the 32-bit reference implementation of MurmurHash2 which assumes 32-bit ints. You should consider using uint32_t instead.