Python typing: concrete class that implements protocol with method that takes another protocol as argument


I'm trying to get the typing correct for the following example:

from __future__ import annotations

from dataclasses import dataclass

from typing_extensions import Protocol


@dataclass(frozen=True)
class Animal(Protocol):
    age: int


@dataclass(frozen=True)
class Cat:
    age: int


@dataclass(frozen=True)
class Dog:
    age: int


class AnimalRepository(Protocol):
    def get(self, animals: list[Animal]) -> None:
        ...


class CatRepository:
    def get(self, animals: list[Cat]) -> None:
        pass


class DogRepository:
    def get(self, animals: list[Dog]) -> None:
        pass


ANIMAL_TYPE_REPOSITORY_MAP: dict[type[Animal], AnimalRepository] = {
    Cat: CatRepository(),
    Dog: DogRepository(),
}

This results in the following problem:

Expression of type "dict[type[Cat] | type[Dog], CatRepository | DogRepository]" cannot be assigned to declared type "dict[type[Animal], AnimalRepository]"
  "CatRepository" is incompatible with protocol "AnimalRepository"
    "get" is an incompatible type
      Type "(animals: list[Cat]) -> None" cannot be assigned to type "(animals: list[Animal]) -> None"
        Parameter 1: type "list[Animal]" cannot be assigned to type "list[Cat]"
          "list[Animal]" is incompatible with "list[Cat]"
  "DogRepository" is incompatible with protocol "AnimalRepository"
    "get" is an incompatible type
      Type "(animals: list[Dog]) -> None" cannot be assigned to type "(animals: list[Animal]) -> None"

So the specific repositories don't properly implement the AnimalRepository as I'm using the implementations of the Animal protocol, i.e., Cat, Dog, rather than the protocol itself.

Am I able to resolve this issue while keeping the same structure?

I suspect this is related to a Liskov Substitution Principle issue but I'm not familiar enough to know how to handle this case. Any help would be much appreciated.

I tried using generics and type variables, without success.


Solution

  • As written, your CatRepository and DogRepository do not satisfy AnimalRepository. Take CatRepository, which has the following method.

    def get(self, animals: list[Cat]) -> None:
    

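    But AnimalRepository expects a method that takes a list[Animal], not a list[Cat]. AnimalRepository says "I can take a list of any animal and do something", but your method only works with cats, not all animals. That is what the last lines of the error are telling you: anyone holding an AnimalRepository may pass it a list[Animal], and a list[Animal] cannot be assigned to list[Cat].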
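    To make the problem concrete, here is a minimal sketch of what would go wrong at runtime if the checker allowed it. The `indoor` field and the `process` function are hypothetical additions for illustration, Animal is written with a read-only property so the frozen dataclasses match it, and typing.Protocol stands in for typing_extensions.Protocol.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class Animal(Protocol):
    @property
    def age(self) -> int: ...


@dataclass(frozen=True)
class Cat:
    age: int
    indoor: bool = True  # hypothetical cat-only field


@dataclass(frozen=True)
class Dog:
    age: int


class AnimalRepository(Protocol):
    def get(self, animals: list[Animal]) -> None: ...


class CatRepository:
    def get(self, animals: list[Cat]) -> None:
        # Relies on an attribute that only cats have.
        self.indoor = [cat.indoor for cat in animals]


def process(repo: AnimalRepository, animals: list[Animal]) -> None:
    # Legal for any real AnimalRepository: it promises to accept every Animal.
    repo.get(animals)


failure = ""
try:
    # A type checker rejects passing CatRepository here; the ignore comment
    # silences it so the runtime failure can be observed.
    process(CatRepository(), [Dog(age=3)])  # type: ignore
except AttributeError as exc:
    failure = str(exc)

print(failure)
```

    The call only fails once the method touches something cat-specific, which is exactly the kind of bug the protocol check is there to catch early.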

    You can change AnimalRepository to be generic.

    from typing import TypeVar
    
    T = TypeVar("T", bound=Animal)
    
    class AnimalRepository(Protocol[T]):
        def get(self, animals: list[T]) -> None:
            ...
    

    Now CatRepository satisfies AnimalRepository[Cat] and DogRepository satisfies AnimalRepository[Dog]. Those types don't have anything in common, since they accept different arguments, but at least all of the code is in the same place.
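    As a quick check of that claim, the sketch below assigns each concrete repository to the matching parameterised protocol. It is an illustration under assumptions, not the only way to write it: Animal uses a read-only property so the frozen dataclasses match, typing.Protocol stands in for typing_extensions.Protocol, and the `seen` attribute is a hypothetical addition so there is something to observe.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, TypeVar


class Animal(Protocol):
    @property
    def age(self) -> int: ...


@dataclass(frozen=True)
class Cat:
    age: int


@dataclass(frozen=True)
class Dog:
    age: int


T = TypeVar("T", bound=Animal)


class AnimalRepository(Protocol[T]):
    def get(self, animals: list[T]) -> None: ...


class CatRepository:
    def __init__(self) -> None:
        self.seen: list[Cat] = []  # hypothetical: records what was passed in

    def get(self, animals: list[Cat]) -> None:
        self.seen.extend(animals)


class DogRepository:
    def get(self, animals: list[Dog]) -> None:
        pass


cats = CatRepository()

# Both assignments should be accepted by a type checker.
cat_repo: AnimalRepository[Cat] = cats
dog_repo: AnimalRepository[Dog] = DogRepository()

cat_repo.get([Cat(age=2)])
# dog_repo.get([Cat(age=2)])  # a checker should reject this: list[Cat] is not list[Dog]
```

    Swapping the two annotations (CatRepository as AnimalRepository[Dog]) should bring the original error back, which is a handy way to confirm the checker is really looking at the type parameter.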

    Unfortunately, there's no way to give your ANIMAL_TYPE_REPOSITORY_MAP a precise type. A dict annotation has a single value type shared by every key, but what you're looking for is a path-dependent map. That is, you want a map in which the type of each value depends on the key it is stored under. That's wildly complicated to implement. We could do it in a dependently-typed proof language like Agda, and it's sort-of possible in a language with an insanely powerful type system like Scala (with Shapeless). But Python's type hinting is a fairly simple system, from a type theory perspective.

    Here's what I recommend. You have a good idea for an interface. You want a way to give a type of Animal and get an AnimalRepository[T] where T is that type. We can write this type in Python, but we just can't prove the function implementation is correct. So I recommend using class AnimalRepository(Protocol[T]) as defined above. Then define your ANIMAL_TYPE_REPOSITORY_MAP with Any typing.

    from typing import Any
    
    _ANIMAL_TYPE_REPOSITORY_MAP: dict[type[Animal], Any] = {
        Cat: CatRepository(),
        Dog: DogRepository(),
    }
    

    Note that I prefix this with an underscore, since we don't want to export it. This dictionary is now an implementation detail.

    Now expose it with a function that casts the type problems away.

    def get_animal_repository(animal_type: type[T]) -> AnimalRepository[T]:
        if animal_type in _ANIMAL_TYPE_REPOSITORY_MAP:
            # Note: The Any type happily casts to whatever we want.
            return _ANIMAL_TYPE_REPOSITORY_MAP[animal_type]
        else:
            # Your choice what to do in the case of an invalid animal;
            # raising an error is one option.
            raise LookupError(f"no repository registered for {animal_type!r}")
    

    The implementation of this function definitely cheats with Any. But callers, and anyone else external to your module, still see a nice type-safe API.
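    Putting the pieces together, a caller's view might look like the following sketch. It makes a few assumptions: Animal uses a read-only property so the frozen dataclasses match, typing.Protocol stands in for typing_extensions.Protocol, Bird is a hypothetical unregistered animal, and raising LookupError is just one possible choice for the unknown case.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Protocol, TypeVar


class Animal(Protocol):
    @property
    def age(self) -> int: ...


@dataclass(frozen=True)
class Cat:
    age: int


@dataclass(frozen=True)
class Dog:
    age: int


@dataclass(frozen=True)
class Bird:  # hypothetical animal with no registered repository
    age: int


T = TypeVar("T", bound=Animal)


class AnimalRepository(Protocol[T]):
    def get(self, animals: list[T]) -> None: ...


class CatRepository:
    def get(self, animals: list[Cat]) -> None:
        pass


class DogRepository:
    def get(self, animals: list[Dog]) -> None:
        pass


# Private: the key/value pairing here is not verified by the type checker.
_ANIMAL_TYPE_REPOSITORY_MAP: dict[type[Animal], Any] = {
    Cat: CatRepository(),
    Dog: DogRepository(),
}


def get_animal_repository(animal_type: type[T]) -> AnimalRepository[T]:
    try:
        # The value is typed Any, so it is accepted as AnimalRepository[T].
        return _ANIMAL_TYPE_REPOSITORY_MAP[animal_type]
    except KeyError:
        raise LookupError(
            f"no repository registered for {animal_type.__name__}"
        ) from None


cat_repo = get_animal_repository(Cat)  # a checker should infer AnimalRepository[Cat]
cat_repo.get([Cat(age=4)])
# cat_repo.get([Dog(age=4)])  # a checker should reject this call

missing = ""
try:
    get_animal_repository(Bird)
except LookupError as exc:
    missing = str(exc)
```

    Because nothing checks the dictionary itself, it is worth keeping it small and in one place, so a mismatched pair such as Cat mapped to DogRepository() is easy to spot in review.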