I am implementing __hash__ for a class, and I use the hash of the username as the object's hash value, i.e.:
class DiscordUser:
    def __init__(self, username):
        self.username = username

    def __hash__(self):
        return hash(self.username)
The problem arises when I add such objects to a set and then test membership with a new object built from the exact same username, i.e.:
user = DiscordUser("Username#123")
if user in users_set:
    # user is already in users_set; this branch is NEVER taken, and I don't understand why
    pass
else:
    # add user to users_set; this branch is ALWAYS taken
    users_set.add(user)
Why is the hash function not working properly, or what am I doing wrong here?
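The hash function is working properly. set membership uses __hash__() as a first pass, but if two objects have the same hash, set will use the __eq__() method to determine whether or not they are equal. Your class does not define __eq__(), so it inherits the default from object, which compares by identity; two separate DiscordUser instances therefore never compare equal, even when they were built from the same username. Ultimately, set guarantees that no two elements are equal, not that no two elements have equal hashes. The hash value is used as a first pass because it is often less expensive to compute than equality.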
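A minimal sketch of the fix, assuming two users should count as the same whenever their usernames match (that equality rule is an assumption about your intent, not something the set requires):

```python
class DiscordUser:
    def __init__(self, username):
        self.username = username

    def __eq__(self, other):
        # Assumed rule: users are equal when their usernames match.
        if not isinstance(other, DiscordUser):
            return NotImplemented
        return self.username == other.username

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.username)


users_set = set()
users_set.add(DiscordUser("Username#123"))

# A second, distinct instance built from the same username.
user = DiscordUser("Username#123")
found = user in users_set  # membership now falls back on __eq__
```

With __eq__ defined this way, the membership test succeeds for a fresh instance, and adding it again leaves the set at one element.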
There is no guarantee that any two objects with the same hash are in fact equal. Consider that there are infinitely many possible values for self.username in your DiscordUser. Python uses siphash for hashing str values. Siphash has a finite range, therefore collisions must be possible.
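Real collisions between strings are hard to produce on demand, so here is a small sketch that forces one. FixedHash is a hypothetical class invented for illustration; every instance deliberately returns the same hash, yet the set still keeps unequal instances apart because it consults __eq__:

```python
class FixedHash:
    """Hypothetical class whose instances all share one hash value."""

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        return isinstance(other, FixedHash) and self.label == other.label

    def __hash__(self):
        # Deliberate collision: every instance hashes to the same value.
        return 42


a = FixedHash("a")
b = FixedHash("b")
items = {a, b}

same_hash = hash(a) == hash(b)   # the hashes collide
both_kept = len(items) == 2      # but the unequal objects are both stored
```

The cost of a collision is only performance: lookups among colliding elements need extra equality checks, but the results stay correct.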
Be careful about using a mutable value as input to hash(). The hash value of an object is expected to be the same for its lifetime.
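To see why this matters, here is a sketch of what can go wrong when the hashed attribute changes after the object is already in a set. MutableUser is a hypothetical stand-in for DiscordUser with __eq__ added:

```python
class MutableUser:
    """Hypothetical user class whose hash depends on a mutable attribute."""

    def __init__(self, username):
        self.username = username

    def __eq__(self, other):
        if not isinstance(other, MutableUser):
            return NotImplemented
        return self.username == other.username

    def __hash__(self):
        return hash(self.username)


user = MutableUser("Username#123")
users_set = {user}
original_hash = hash(user)

# Mutating the attribute changes the hash while the set still
# files the object under the old one.
user.username = "Renamed#456"
hash_changed = hash(user) != original_hash

# Lookup uses the new hash, so the very same object is typically
# no longer found, even though it is still stored in the set.
still_found = user in users_set
```

One way to avoid this is to treat username as read-only after construction, or to hash an identifier that never changes.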
Take a look at this answer for some nice info about sets, hashing, and equality testing in Python.
edit: Python has used siphash for str values since 3.4.