python python-typing pydantic

Defining a type that is both a Protocol and a Pydantic BaseModel


I need to define a type that is both a Protocol and a BaseModel.

In detail:
The model Foo has a field data that can receive any model that has an id attribute (let's call the type HasId).

class Foo(BaseModel):
    data: HasId
    ...

If a Protocol could inherit from non-protocol base classes, I would have defined HasId like this:

class HasId(BaseModel, Protocol):
    id: str | UUID

Or, using the Intersection solution proposed here, I would have defined it like this:

class HasIdProtocol(Protocol):
    id: str | UUID

type HasId = Intersection[HasIdProtocol, BaseModel]
# OR
type HasId = HasIdProtocol & BaseModel

Is there a way I can still achieve this functionality currently?


Solution

  • Your problem comes down to a fundamental limitation of Python's type system.

    If I understand correctly, you're trying to validate that a value is both a BaseModel and conforms to a Protocol interface, right?

    If so, there is currently no way to get true static type checking for an intersection of a Protocol and BaseModel. Python's type system does not support intersection types yet.

    That said, there are a few practical workarounds:

    Option 1: Runtime validation with type annotation for documentation

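    from typing import Protocol, runtime_checkable
    from uuid import UUID
    
    from pydantic import BaseModel, ConfigDict, field_validator
    
    @runtime_checkable
    class HasIdProtocol(Protocol):
        id: str | UUID
    
    class Foo(BaseModel):
        # Pydantic v2: a Protocol has no built-in schema, so allow arbitrary
        # types; the field is then checked with isinstance() against the protocol
        model_config = ConfigDict(arbitrary_types_allowed=True)
    
        # The annotation tells the type checker that data has an id
        data: HasIdProtocol
    
        # Enforce the BaseModel half of the requirement at runtime
        @field_validator('data')
        @classmethod
        def validate_data_is_model(cls, v):
            if not isinstance(v, BaseModel):
                raise ValueError("data must be a BaseModel")
            return v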
    

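    This approach keeps the Protocol as the annotation that type checkers see, while the validator enforces the BaseModel requirement at runtime.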

    Option 2: Type assertion function

    from typing import Any, Protocol, runtime_checkable
    from uuid import UUID
    
    from pydantic import BaseModel, ConfigDict
    
    @runtime_checkable
    class HasIdProtocol(Protocol):
        id: str | UUID
    
    def ensure_base_model_with_id(value: Any) -> HasIdProtocol:
        """Assert that value is both a BaseModel and has an id attribute."""
        if not isinstance(value, BaseModel):
            raise TypeError(f"Expected a BaseModel, got {type(value)}")
        # runtime_checkable only checks that the attribute exists, not its type
        if not isinstance(value, HasIdProtocol):
            raise TypeError("Value must have an id attribute")
        return value
    
    class Foo(BaseModel):
        # Needed so Pydantic v2 accepts a Protocol as a field type
        model_config = ConfigDict(arbitrary_types_allowed=True)
    
        data: HasIdProtocol
    
        def __init__(self, **data):
            super().__init__(**data)
            # Re-check after validation; this raises TypeError, not ValidationError
            self.data = ensure_base_model_with_id(self.data)
    

    You might prefer this approach if you want the check in a standalone function that can also be reused outside the model.

    Python's type system may eventually support this use case, but there is no way to know when.

    In practice, I've found the first approach (type annotation plus runtime validation) works best for most codebases.

    Happy building!