javahashtablelinear-probing

how to Compute the average probe length for success and failure - Linear probe (Hash Tables)


I'm doing an assignment for my Data Structures class. we were asked to to study linear probing with load factors of .1, .2 , .3, ...., and .9. The formula for testing is:

The average probe length using linear probing is roughly

Success--> ( 1 + 1/(1-L)**2)/2
or
Failure--> (1+1(1-L))/2.

we are required to find the theoretical using the formula above which I did(just plug the load factor in the formula), then we have to calculate the empirical (which I not quite sure how to do). here is the rest of the requirements

**For each load factor, 10,000 randomly generated positive ints between 1 and 50000 (inclusive) will be inserted into a table of the "right" size, where "right" is strictly based upon the load factor you are testing. Repeats are allowed. Be sure that your formula for randomly generated ints is correct. There is a class called Random in java.util. USE it! After a table of the right (based upon L) size is loaded with 10,000 ints, do 100 searches of newly generated random ints from the range of 1 to 50000. Compute the average probe length for each of the two formulas and indicate the denominators used in each calculationSo, for example, each test for a .5 load would have a table of > > size approximately 20,000 (adjusted to be prime) and similarly each test for a .9 load would have a table of approximate size 10,000/.9 (again adjusted to be prime).

The program should run displaying the various load factors tested, the average probe for each search (the two denominators used to compute the averages will add to 100), and the theoretical answers using the formula above. .**

how do I calculate the empirical success?

here is my code so far:

import java.util.Random;
/**
 *
 * @author Johnny
 */
class DataItem
{
    private int iData;
    public DataItem(int it)
    {iData = it;}
    public int getKey()
    {
        return iData;
    }
}

class HashTable
{
private DataItem[] hashArray;
private int arraySize;
public HashTable(int size)
{
    arraySize = size;
    hashArray = new DataItem[arraySize];
}
public void displayTable()
{
    int sp=0;
    System.out.print("Table: ");
    for(int j=0; j<arraySize; j++)
{
    if(sp>50){System.out.println("");sp=0;}

    if(hashArray[j] != null){
        System.out.print(hashArray[j].getKey() + " ");sp++;}
    else
    {System.out.print("** "); sp++;}
}
    System.out.println("");
}

public int hashFunc(int key)
{
    return key %arraySize;
}

public void insert(DataItem item)
{
    int key = item.getKey();
    int hashVal = hashFunc(key);

    while(hashArray[hashVal] != null &&
                    hashArray[hashVal].getKey() != -1)
    {
        ++hashVal;
        hashVal %= arraySize;
    }
    hashArray[hashVal]=item;
}
public int hashFunc1(int key)
{
    return key % arraySize;
}

public int hashFunc2(int key)
{
// non-zero, less than array size, different from hF1
// array size must be relatively prime to 5, 4, 3, and 2
    return 5 - key % 5;
}


public DataItem find(int key) // find item with key
// (assumes table not full)
    {
    int hashVal = hashFunc1(key); // hash the key
    int stepSize = hashFunc2(key); // get step size
    while(hashArray[hashVal] != null) // until empty cell,
    { // is correct hashVal?
        if(hashArray[hashVal].getKey() == key)
            return hashArray[hashVal]; // yes, return item
        hashVal += stepSize; // add the step
        hashVal %= arraySize; // for wraparound
    }
    return null; // can’t find item
    }
}
public class n00645805 {
/**
 * @param args the command line arguments
 */
public static void main(String[] args) {
    double b=1;
    double L;
    double[] tf = new double[9];
    double[] ts = new double[9];
    double d=0.1;
    DataItem aDataItem;
    int aKey;
    HashTable h1Table = new HashTable(100003); //L=.1
    HashTable h2Table = new HashTable(50051);  //L=.2
    HashTable h3Table = new HashTable(33343);  //L=.3
    HashTable h4Table = new HashTable(25013);  //L=.4
    HashTable h5Table = new HashTable(20011);  //L=.5
    HashTable h6Table = new HashTable(16673);  //L=.6
    HashTable h7Table = new HashTable(14243);  //L=.7
    HashTable h8Table = new HashTable(12503);  //L=.8
    HashTable h9Table = new HashTable(11113);  //L=.9

    fillht(h1Table);
    fillht(h2Table);
    fillht(h3Table);
    fillht(h4Table);
    fillht(h5Table);
    fillht(h6Table);
    fillht(h7Table);
    fillht(h8Table);
    fillht(h9Table);
    pm(h1Table);
    pm(h2Table);
    pm(h3Table);
    pm(h4Table);
    pm(h5Table);
    pm(h6Table);
    pm(h7Table);
    pm(h8Table);
    pm(h9Table);

    for (int j=1;j<10;j++)
    {
        //System.out.println(j);
        L=Math.round((b-d)*100.0)/100.0;
        System.out.println(L);
        System.out.println("ts "+(1+(1/(1-L)))/2);
        System.out.println("tf "+(1+(1/((1-L)*(1-L))))/2);
        tf[j-1]=(1+(1/(1-L)))/2;
        ts[j-1]=(1+(1/((1-L)*(1-L))))/2;
        d=d+.1;
    }
    display(ts,tf);
}
public static void fillht(HashTable a)
{
    Random r = new Random();
    for(int j=0; j<10000; j++)
    {
        int aKey;
        DataItem y;
        aKey =1+Math.round(r.nextInt(50000));
        y = new DataItem(aKey);
        a.insert(y);

    }
}
public static void pm(HashTable a)
{
    DataItem X;
    int numsuc=0;
    int numfail=0;
    int aKey;
    Random r = new Random();
    for(int j=0; j<100;j++)
    {
        aKey =1+Math.round(r.nextInt(50000));
        X = a.find(aKey);
        if(X != null)
        {
            //System.out.println("Found " + aKey);
            numsuc++;
        }
        else
        {
            //System.out.println("Could not find " + aKey);
            numfail++;
        }

    }
    System.out.println("# of succ is "+ numsuc+" # of failures is "+ numfail);
}
public static void display(double[] s, double[] f)
{

}

}


Solution

  • You should take into account that Java's HashTable uses a closed addressing (no probing) implementation, so you have separate buckets in which many items can be placed. This is not what you are looking for in your benchmarks. I'm not sure about HashMap implementation but I think it uses open addressing too.

    So forget about JDK classes.. since you want to calculate empirical values you should write your own version of an hashtable that uses the open addressing implementation with linear probing but you should take care of counting the probe length whenever you try to get a value from the hashmap..

    For example you can write your hashmap and then take care of having

    class YourHashMap
    {
       int empiricalGet(K key)
       {
         // search for the key but store the probe length of this get operation
    
         return probeLength;
       }
    }
    

    Then you can easily benchmark it by searching how many keys you want and calculating the average probe length.

    Otherwise you can just provide the hasmap the ability of storing the total probe length and the count of gets requested and retrieve them after the benchmark run to calculate average value.

    This kind of exercises must prove that the empirical value concordates with the theoretical one. So take also into account the fact that you may need many benchmarks, and then do the average of them all, assuring that variance is not too high.