I have a dataclass whose instances I want to hash and order, using the id
member as a key.
from dataclasses import dataclass, field

@dataclass(eq=True, order=True)
class Category:
    id: str = field(compare=True)
    name: str = field(default="set this in post_init", compare=False)
I know that I can implement __hash__ myself. However, I would like dataclasses to do the work for me because they are intended to handle this.
Unfortunately, the above dataclass fails:
a = sorted(list(set([Category(id='x'), Category(id='y')])))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'Category'
From the docs:
Here are the rules governing implicit creation of a __hash__() method: [...] If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).
Since you set eq=True and left frozen at the default (False), your dataclass is unhashable.
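As a minimal sketch of the three rules quoted above, the snippet below defines one small class per case. The class names MutableEq, FrozenEq and NoEq are made up for illustration and are not part of the question.

```python
from dataclasses import dataclass

# eq=True with frozen left at its default (False): __hash__ is set to None.
@dataclass(eq=True)
class MutableEq:
    id: str

# eq=True and frozen=True: dataclass() generates a __hash__ from the fields.
@dataclass(eq=True, frozen=True)
class FrozenEq:
    id: str

# eq=False: __hash__ is left untouched, so object.__hash__ is inherited.
@dataclass(eq=False)
class NoEq:
    id: str

print(MutableEq.__hash__)                          # None
print(hash(FrozenEq("x")) == hash(FrozenEq("x")))  # True
print(NoEq.__hash__ is object.__hash__)            # True
```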
You have 3 options:
1. Set frozen=True (in combination with the default eq=True), which will make your class immutable and hashable.

   @dataclass(frozen=True)

2. Set unsafe_hash=True, which will create a __hash__ method but leave your class mutable.

   @dataclass(unsafe_hash=True)

   Mutability risks problems if an instance of your class is modified while stored in a dict or set:

   cat = Category('foo', 'bar')
   categories = {cat}
   cat.id = 'baz'
   print(cat in categories)  # False

3. Manually implement a __hash__ method.
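Applied to the Category class from the question, the frozen and hand-written variants might look like the sketch below. The names FrozenCategory and HandHashedCategory are illustrative, the default for name is shortened to an empty string, and order=True is kept so that sorting still works. Hashing only id in the hand-written version is an assumption that matches the compare=False setting on name.

```python
from dataclasses import dataclass, field

# Frozen variant: immutable, hashable, and ordered by id only.
@dataclass(frozen=True, order=True)
class FrozenCategory:
    id: str
    name: str = field(default="", compare=False)

# Hand-written variant: stays mutable; an explicit __hash__ defined in the
# class body is kept by dataclass() instead of being replaced with None.
@dataclass(order=True)
class HandHashedCategory:
    id: str
    name: str = field(default="", compare=False)

    def __hash__(self):
        # Hash only the field that takes part in comparisons.
        return hash(self.id)

result = sorted(set([FrozenCategory(id="y"), FrozenCategory(id="x")]))
print([c.id for c in result])  # ['x', 'y']
```

The hand-written variant carries the same risk as unsafe_hash=True: changing id while an instance sits in a set or dict makes it unreachable by lookup.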