python pydantic

pydantic model: How to exclude field from being hashed / eq-compared?


I have the following hashable pydantic model:

from datetime import datetime as dt
from pydantic import BaseModel

class TafReport(BaseModel, frozen=True):
    download_date: dt
    icao: str
    issue_time: dt
    validity_time_start: dt
    validity_time_stop: dt
    raw_report: str

Now I don't want these reports to be considered different just because their download date is different (I set it with datetime.now()). How can I exclude download_date from being considered in the __hash__ and __eq__ functions so that I can do stunts like:

tafs = list(set(tafs))

and have a unique set of tafs even though two might have different download dates? I'm looking for a solution where I don't have to override the __hash__ and __eq__ methods...

I checked out this topic but it only answers how to exclude a field from the model in general (so it doesn't show up in the json dumps), but I do want it to show up in the json dump.


Solution

  • Unfortunately, BaseModel has no built-in option for this at the moment, but there are two workarounds that you can consider:

    Changing from BaseModel to a Pydantic dataclass:

    from dataclasses import field
    from datetime import datetime as dt
    from pydantic import TypeAdapter
    from pydantic.dataclasses import dataclass
    
    @dataclass(frozen=True)
    class TafReport:
        download_date: dt = field(compare=False)
        icao: str
        issue_time: dt
        validity_time_start: dt
        validity_time_stop: dt
        raw_report: str
    
    TafReportAdapter = TypeAdapter(TafReport)
    
    SameTime = dt.now()
    
    TafReport1 = TafReport(download_date=dt.now(),
                           icao='icao',
                           issue_time=SameTime,
                           validity_time_start=SameTime,
                           validity_time_stop=SameTime,
                           raw_report='raw_report')
    
    TafReport2 = TafReport(download_date=dt.now(),
                           icao='icao',
                           issue_time=SameTime,
                           validity_time_start=SameTime,
                           validity_time_stop=SameTime,
                           raw_report='raw_report')
    
    print(TafReportAdapter.dump_json(TafReport1), hash(TafReport1))
    print(TafReportAdapter.dump_json(TafReport2), hash(TafReport2))
    

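    Both reports now have the same hash and compare equal even though their download_date differs, because compare=False leaves the field out of the generated __eq__ and __hash__.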
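The effect on set-based deduplication can be checked in isolation. The following is a minimal sketch that uses a plain standard-library dataclass instead of the Pydantic one, on the assumption that the compare=False behaviour comes from the standard dataclasses machinery that the Pydantic dataclass builds on. The Report class and its reduced field list are hypothetical stand-ins for TafReport.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Report:
    # Hypothetical stand-in for TafReport, reduced to three fields.
    # compare=False removes the field from __eq__; the generated __hash__
    # follows compare by default, so the field is left out there as well.
    download_date: datetime = field(compare=False)
    icao: str
    raw_report: str


now = datetime(2024, 1, 1, 12, 0)
first = Report(download_date=now, icao="EDDF", raw_report="raw_report")
second = Report(download_date=now + timedelta(minutes=5), icao="EDDF", raw_report="raw_report")

# The two reports differ only in download_date, so the set keeps one of them.
unique = list(set([first, second]))
print(first == second, hash(first) == hash(second), len(unique))
```

Which of the two reports survives in the set is not something to rely on; if the earliest download_date matters, it is safer to sort before deduplicating.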

    Exclude the download_date from the model and allow extra fields:

    from datetime import datetime as dt
    from pydantic import BaseModel
    
    class TafReport(BaseModel, frozen=True, extra='allow'):
        icao: str
        issue_time: dt
        validity_time_start: dt
        validity_time_stop: dt
        raw_report: str
    
    SameTime = dt.now()
    
    TafReport1 = TafReport(icao='icao',
                           issue_time=SameTime,
                           validity_time_start=SameTime,
                           validity_time_stop=SameTime,
                           raw_report='raw_report',
                           download_date=dt.now())
    
    TafReport2 = TafReport(icao='icao',
                           issue_time=SameTime,
                           validity_time_start=SameTime,
                           validity_time_stop=SameTime,
                           raw_report='raw_report',
                           download_date=dt.now())
    
    print(TafReport1.model_dump(), hash(TafReport1))
    print(TafReport2.model_dump(), hash(TafReport2))
    

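    In this case the hash function is built from the fields declared on the model, so adding download_date as an undeclared extra field does not change the hash. Note, however, that in Pydantic v2 BaseModel.__eq__ also compares the extra fields, so two reports with different download_date values still compare unequal and set() keeps both of them. This option only makes the hashes match.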
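Whichever option is used, deduplication can also be done with an explicit key rather than through __hash__ and __eq__, which leaves the model itself untouched. This is a minimal sketch of that alternative and not part of the options above. It uses a plain dataclass as a hypothetical stand-in for the model, and the names Report, report_key and dedupe_by_key are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Report:
    # Hypothetical stand-in for TafReport. download_date takes part in
    # equality here, so set() alone would keep both EDDF reports.
    download_date: datetime
    icao: str
    raw_report: str


def report_key(report):
    # Everything that identifies a report, deliberately leaving out download_date.
    return (report.icao, report.raw_report)


def dedupe_by_key(reports, key):
    # Dicts preserve insertion order; setdefault keeps the first report per key.
    seen = {}
    for report in reports:
        seen.setdefault(key(report), report)
    return list(seen.values())


start = datetime(2024, 1, 1, 12, 0)
reports = [
    Report(start, "EDDF", "raw_report"),
    Report(start + timedelta(minutes=5), "EDDF", "raw_report"),
    Report(start, "EDDM", "raw_report"),
]
unique = dedupe_by_key(reports, report_key)
print(len(set(reports)), len(unique))
```

For the real model, the key function would need to return a tuple of every field except download_date. Unlike set(), this approach keeps the input order and always retains the first report seen for each key.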