javahashmapsethashcodehash-code-uniqueness

Accuracy of a Set with a HashMap backing's contains() method?


Hi I'm using a Set backed by a HashMap to keep track of what edges I have already traversed in a graph. I was planning on keying the set by the result of adding the hashcodes of the data stored in each edge.

v.getData().hashCode() + wordV.getData().hashCode()

But how reliable is this when using contains to check to see if the edge is in the set? Couldn't I hypothetically get a false positive? Is there anyway to overcome this?

the exact statement that causes me concern is:

edgeSet.contains(v.getData().hashCode() + wordV.getData().hashCode())

thanks!

Oh by the way I'm using Java.

EDIT:

I should have made this clear in the question. In my graph there is no edge object, there are vertex objects which each hold a list of more vertex objects, which is the edge. So I guess the question that follows from that combined with your responses is:

Can I use a Set to store references to information as opposed to objects....? i.e. can I store the result of adding the two hashcodes for the vertices' data objects?

EDIT2:

I am indeed using the Java library for my hashmap, I declare it as below:

Set<Integer> edgeSet = Collections.newSetFromMap(new ConcurrentHashMap<Integer, Boolean>());

Solution

  • Note: from your question I couldn't tell whether you were using HashSet, or your own home rolled implementation. Be aware that Java's HashSet is simply a wrapper for a HashMap where the values are ignored. HashSet.contains simply calls containsKey on the inner map.

    HashMap.containsKey uses the same lookup as get. That will compute the hash and use it to find the right bucket. From there it will walk the bucket and use equals until it finds the exact match. Assuming the element type implements both hashCode and equals correctly, there's no chance of getting a false positive using containsKey since equals is ultimately used to confirm.

    The relevant source code is in the package-private method getEntry, which is used by both containsKey and get:

    final Entry<K,V> getEntry(Object key) {
        int hash = (key == null) ? 0 : hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }
    

    EDIT:

    Can I use a Set to store references to information as opposed to objects....? i.e. can I store the result of adding the two hashcodes for the vertices' data objects?

    No, you would need to implement a new class representing this information and store instances of it in the Set. This could be a simple POJO with a field for each piece of information, and with hashCode and equals overridden correctly.