python python-typing mypy chromadb

Unexpected error from mypy: Why isn't my type acceptable?


I am getting an error from mypy that I cannot explain (and therefore cannot fix):

build_rag.py:116: error: 
Argument "metadatas" to "add" of "AsyncCollection" 
has incompatible type "list[dict[str, str]]"; 
expected "Mapping[str, str | int | float | bool] | 
list[Mapping[str, str | int | float | bool]] | 
None"  [arg-type]

I am passing a list[dict[str,str]] as detected correctly by mypy. I would have expected this to match against the second of the three expected types: list[Mapping[str, str|int|float|bool]]. I am not the author of the add() function being called. (It is the chromadb.AsyncCollection.add() method.)

Am I being stupid/blind, or should mypy have accepted my input?

The code calling the function is this, with ids and chunks defined earlier. The code variable is already a string; the call to str() was just an act of desperation on my part. The path variable is a pathlib.Path object. chunks is a list[str].

            metadata = {
                "dc.identifier": str(path),
                "code": str(code),
            }
            metadatas = [metadata for chunk in chunks]
            await collection.add(
                ids=ids,
                documents=chunks,
                metadatas=metadatas,
            )

This is the only mypy error flagged in my file so, fingers crossed, the rest of the types are as advertised.


Solution

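  • A list[dict[str,str]] isn't a list[Mapping[str, str|int|float|bool]], because list is invariant in its element type. You can append a dict[str, float] to a list[Mapping[str, str|int|float|bool]], but you can't append a dict[str, float] to a list[dict[str,str]]. If mypy accepted your list, the function could legally insert a value that breaks your list's declared type.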

    chromadb should probably use Sequence instead of list to annotate sequence arguments it doesn't need to modify. A list[dict[str,str]] would be a valid Sequence[Mapping[str, str|int|float|bool]], because the Sequence ABC doesn't specify any insertion operations, which makes it safe to treat as covariant.

    In the meantime, you can explicitly annotate your variables differently, overriding the default type inference:

    from collections.abc import Mapping
    metadata: Mapping[str, str | int | float | bool] = {"dc.identifier": str(path), "code": code}