hash-function

Hashfunction to map combinations of 5 to 7 cards


Referring to the original problem: Optimizing hand-evaluation algorithm for Poker-Monte-Carlo-Simulation

I have a list of 5 to 7 cards and want to store their value in a hashtable, which should be an array of 32-bit-integers and directly accessed by the hashfunctions value as index. Regarding the large amount of possible combinations in a 52-card-deck, I don't want to waste too much memory.

Numbers:

Storing 157 million 32-bit-integer values costs about 580MB. So I would like to avoid increasing this number by reserving memory in an array for values that aren't needed.

So the question is: How could a hashfunction look like, that maps each possible, non duplicated combination of cards to a consecutive value between 0 and 156.742.040 or at least comes close to it?


Solution

  • Paul Senzee has a great post on this for 7 cards (deleted link as it is broken and now points to a NSFW site).

    His code is basically a bunch of pre-computed tables and then one function to look up the array index for a given 7-card hand (represented as a 64-bit number with the lowest 52 bits signifying cards):

    inline unsigned index52c7(unsigned __int64 x)
    {
        const unsigned short *a = (const unsigned short *)&x;
        unsigned A    = a[3],                B    = a[2],                        C    = a[1],            D   = a[0],
                 bcA  = _bitcount[A],        bcB  = _bitcount[B],                bcC  = _bitcount[C],    bcD = _bitcount[D],
                 mulA = _choose48x[7 - bcA], mulB = _choose32x[7 - (bcA + bcB)], mulC = _choose16x[bcD];
        return _offsets52c[bcA]                      + _table4[A] * mulA + 
               _offsets48c[ (bcA << 4)        + bcB] + _table [B] * mulB +
               _offsets32c[((bcA + bcB) << 4) + bcC] + _table [C] * mulC + 
                                                       _table [D];
    }
    

    In short, it's a bunch of lookups and bitwise operations powered by pre-computed lookup tables based on perfect hashing.

    If you go back and look at this website, you can get the perfect hash code that Senzee used to create the 7-card hash and repeat the process for 5- and 6-card tables (essentially creating a new index52c7.h for each). You might be able to smash all 3 into one table, but I haven't tried that.

    All told that should be ~628 MB (4 bytes * 157 M entries). Or, if you want to split it up, you can map it to 16-bit numbers (since I believe most poker hand evaluators only need 7,462 unique hand scores) and then have a separate map from those 7,462 hand scores to whatever hand categories you want. That would be 314 MB.