python set

Limit object comparison to certain attributes when adding to a set


I have a Person class like this:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return '<Person {}>'.format(self.name)

I want to add some instances of this class to a set, like this:

tom = Person('tom', 18)
mary = Person('mary', 22)
mary2 = Person('mary2', 22)

person_set = {tom, mary, mary2}
print(person_set)
# output: {<Person tom>, <Person mary>, <Person mary2>}

As you can see, there are 2 Marys in the set. How can I make it so that Person instances with the same age are considered the same person, and only added to the set once?

In other words, how can I get a result of {<Person tom>, <Person mary>}?


Solution

  • When an object is added to a Python set, its hash is computed first. If the set already contains one or more objects with the same hash, each of them is then compared for equality with the new object, and the new object is added only if none of them is equal to it.

    The upshot of this is that you need to implement the __hash__(...) and __eq__(...) methods on your class. For example:

    class Person:
        def __init__(self, name, age):
            self.name = name
            self.age = age
    
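        def __eq__(self, other):
            # Only compare against other Person instances
            if not isinstance(other, Person):
                return NotImplemented
            return self.age == other.age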
    
        def __hash__(self):
            return hash(self.age)
    
        def __repr__(self):
            return '<Person {}>'.format(self.name)
    
    tom = Person('tom', 18)
    mary = Person('mary', 22)
    mary2 = Person('mary2', 22)
    
    person_set = {tom, mary, mary2}
    print(person_set)
    # output: {<Person tom>, <Person mary>}
    

    However, think carefully about what the correct implementation of __hash__ and __eq__ should be for your class. The above example works, but is nonsensical: because both __hash__ and __eq__ are defined only in terms of age, any two people of the same age compare equal everywhere in your program, not just inside this set (e.g. mary == mary2 is True).
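
If redefining equality for the whole class is too broad, one possible alternative is to leave Person as in the question and deduplicate by an explicit key instead. The following is a minimal sketch, not the only way to do it; the helper name unique_by is hypothetical and not part of the original answer. It relies on dicts preserving insertion order (Python 3.7+), so the first person seen for each age is the one kept.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return '<Person {}>'.format(self.name)


def unique_by(items, key):
    """Hypothetical helper: keep the first item seen for each key(item) value."""
    seen = {}
    for item in items:
        # setdefault stores the item only if this key has not been seen yet
        seen.setdefault(key(item), item)
    return list(seen.values())


people = [Person('tom', 18), Person('mary', 22), Person('mary2', 22)]
unique_people = unique_by(people, key=lambda p: p.age)
print(unique_people)
# output: [<Person tom>, <Person mary>]
```

With this approach mary == mary2 stays False, and the "same age means same person" rule applies only where unique_by is called.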