cperfect-hash

near perfect or perfect hash of memory addresses in c


I have a list of memory addresses from 0xc0003000 to 0xc04a0144 there are many gaps and < 4096 entries in the list. It is known at compile time and I want to make a perfect hash for it.

However looking up perfect hashing online gives me information mostly related to hashing strings and they don't seem to translate well.

To be clear I want to be able to obtain the memory address at run time and check that it is in the hash quickly. Currently I am using a binary search which is on average about 8 loops to find the answer.

Any ideas what tree I should bark up?


Solution

  • Here's a sample gperf program. I included a NUL and a newline in the sample data to prove that they don't cause it to fail.

    %{
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <inttypes.h>
    #include <arpa/inet.h>
    %}
    %%
    "\xc0\x01\x02\x03"
    "\xc0\xff\xff\xff"
    "\xc0\xff\x00\xff"
    "\xc0\x0a\xff\xff"
    %%
    int main(int argc, const char **argv)
    {
        int i;
    
        for(i=1;i<argc;++i) {
            uint32_t addr = ntohl(strtoul(argv[i], 0, 16));
            if(in_word_set((char *)&addr, 4))
                printf("0x%08"PRIx32" is in the list.\n", htonl(addr));
            else
                printf("0x%08"PRIx32" is not in the list.\n", htonl(addr));
        }
        return 0;
    }
    

    Save as addrs.gperf, compile and test with

    gperf -l addrs.gperf > addrs.c
    gcc addrs.c -o addrs
    ./addrs c0000000 c0010203 c0ffffff c00affff c0ff0aff c0ffff00 c0ff00ff