I am getting an error from mypy that I cannot explain (and therefore cannot fix):
    build_rag.py:116: error:
        Argument "metadatas" to "add" of "AsyncCollection"
        has incompatible type "list[dict[str, str]]";
        expected "Mapping[str, str | int | float | bool] |
        list[Mapping[str, str | int | float | bool]] |
        None"  [arg-type]
I am passing a list[dict[str,str]], as detected correctly by mypy. I would have expected this to match against the second of the three expected types: list[Mapping[str, str|int|float|bool]]. I am not the author of the add() function being called. (It is the chromadb.AsyncCollection.add() method.)
Am I being stupid/blind, or should mypy have accepted my input?
The code calling the function is this, with ids and chunks defined earlier. The code variable is already a string; the call to str() was just an act of desperation on my part. The path variable is a pathlib.Path object. chunks is a list[str].
    metadata = {
        "dc.identifier": str(path),
        "code": str(code),
    }
    metadatas = [metadata for chunk in chunks]
    await collection.add(
        ids=ids,
        documents=chunks,
        metadatas=metadatas,
    )
This is the only mypy error flagged in my file so, fingers crossed, the rest of the types are as advertised.
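A list[dict[str,str]] isn't a list[Mapping[str, str|int|float|bool]], because you can add a dict[str, float] to a list[Mapping[str, str|int|float|bool]], but you can't add a dict[str, float] to a list[dict[str,str]]. In type-checker terms, list is invariant in its element type: because a list can be mutated, a list[X] is only accepted where a list[Y] is expected if X and Y are the same type, even when X is a subtype of Y.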
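To make the problem concrete, here is a minimal sketch of what would go wrong if mypy allowed the call. The append_score function and the variable names are made up for illustration, and Union[str, int, float, bool] is just the older spelling of str|int|float|bool, used so the alias also evaluates on Python versions before 3.10.

```python
from collections.abc import Mapping
from typing import Union

# Same value type as in the chromadb error message.
Metadata = Mapping[str, Union[str, int, float, bool]]


def append_score(items: list[Metadata]) -> None:
    # Hypothetical function: appending a mapping with a float value is
    # perfectly legal for a list[Metadata].
    items.append({"score": 0.5})


strings_only: list[dict[str, str]] = [{"code": "abc"}]

# mypy rejects this call with [arg-type]. Python itself does not check
# annotations, so with the error silenced the call goes through at runtime.
append_score(strings_only)  # type: ignore[arg-type]

# The list annotated as list[dict[str, str]] now holds a float value.
print(strings_only)
```

The annotation on strings_only has become a lie after the call, which is exactly what the invariance rule is there to prevent.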
chromadb should probably use Sequence instead of list to annotate sequence arguments they don't need to modify. A list[dict[str,str]] would be a valid Sequence[Mapping[str, str|int|float|bool]], because the Sequence ABC doesn't specify any insertion operations.
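As a rough illustration of the Sequence point, the sketch below uses a made-up read-only function, count_metadata, which is not part of chromadb. With the parameter annotated as a Sequence, mypy should accept a plain list[dict[str, str]] without any extra annotations on the caller's side.

```python
from collections.abc import Mapping, Sequence
from typing import Union

# Older spelling of str|int|float|bool, so the alias works before Python 3.10.
MetadataValue = Union[str, int, float, bool]


def count_metadata(metadatas: Sequence[Mapping[str, MetadataValue]]) -> int:
    # Hypothetical consumer that only reads from its argument,
    # so a read-only Sequence annotation is enough.
    return sum(len(m) for m in metadatas)


metadatas: list[dict[str, str]] = [
    {"dc.identifier": "docs/a.txt", "code": "abc"},
]

# Accepted by mypy: Sequence is covariant in its element type, and
# Mapping is covariant in its value type.
total = count_metadata(metadatas)
print(total)
```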
In the meantime, you can explicitly annotate your variables differently, overriding the default type inference:
    from collections.abc import Mapping
    metadata: Mapping[str, str | int | float | bool] = {"dc.identifier": str(path), "code": code}
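Here is a slightly fuller sketch of that workaround. The add function below is a hypothetical synchronous stand-in that only copies the metadatas annotation from the error message; it is not chromadb's real method, and the path, code, ids and chunks values are placeholders. Annotating the list itself instead of the dict should work as well, since mypy uses the declared type when checking the comprehension.

```python
from collections.abc import Mapping
from pathlib import Path
from typing import Union

# Older spelling of str|int|float|bool, so the alias works before Python 3.10.
Metadata = Mapping[str, Union[str, int, float, bool]]


def add(
    ids: list[str],
    documents: list[str],
    metadatas: Union[Metadata, list[Metadata], None] = None,
) -> int:
    # Hypothetical stand-in for the real call; it just counts the ids.
    return len(ids)


# Placeholder inputs standing in for the values defined earlier in the question.
path = Path("docs/example.txt")
code = "abc"
chunks = ["first chunk", "second chunk"]
ids = [f"{path}:{i}" for i in range(len(chunks))]

# Option 1: annotate the dict, so the comprehension is inferred
# as list[Mapping[...]].
metadata: Metadata = {"dc.identifier": str(path), "code": code}
metadatas = [metadata for _ in chunks]

# Option 2: leave the dict alone and annotate the list instead.
plain = {"dc.identifier": str(path), "code": code}
metadatas_alt: list[Metadata] = [plain for _ in chunks]

added = add(ids=ids, documents=chunks, metadatas=metadatas)
print(added)
```

Either way the runtime objects are unchanged; only the static type that mypy sees is different.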