python chromadb

ChromaDB: Collection {name} is not created


I am trying to get an existing ChromaDB collection with the get_or_create_collection method of a PersistentClient object, but I get the message 'Collection "collection_name" is not created.' I am using the official chromadb package, v0.5.0, in a pipenv environment with Python 3.10.12.

Here's a snippet of the source code:

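import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="/path/to/collection/directory")
print(f"Available collections: {client.list_collections()}")    # <- this lists the collection that I want to get
embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="all-mpnet-base-v2", device="cuda")
# Pass the embedding function by keyword: the second positional parameter is metadata
collection = client.get_or_create_collection("test_collection", embedding_function=embedding_function)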

I tracked the message down to */lib/python3.10/site-packages/chromadb/api/segment.py, line 189.

Another question on Stack Overflow mentions the exact same message, but nobody there explains why it happens.

I also tried to reproduce the message by creating an empty project and pasting my code in as is. The message did not appear there, even though everything was the same (package version, collection directory path, collection name and embedding function). I have asserted the values of all parameters, and they are always correct.

It is possible to query text from the collection despite the above message.

Is this a bug in the ChromaDB API or am I doing something wrong on my end? Every bit of help is greatly appreciated!


Solution

  • It turns out that this is a bug in version 0.5.0 of the chromadb Python package. You can find more details and follow the discussion in the bug report in the Chroma GitHub repo. Here is how I found out that it is a bug (from the report description):

    I also tried to reproduce the message by creating a copy of the project and changing the version of the chromadb Python package inside a pipenv environment. With version 0.4.24 the message did not appear, even though everything else was the same (collection directory path, collection name and embedding function). I've asserted the values of all parameters and they are always correct. The message appears as soon as I upgrade to the latest version, 0.5.0.
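
If you want to check whether your own environment has the version from the report, something like the following might help. It is only a minimal sketch: the helper names `installed_version` and `is_affected` are hypothetical and not part of chromadb, and the two version strings are simply the ones named in the report above (I have not checked any other releases).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the bug report description above; other releases are untested here.
AFFECTED_VERSION = "0.5.0"     # the message appeared with this version
KNOWN_GOOD_VERSION = "0.4.24"  # the message did not appear with this version


def installed_version(package="chromadb"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_affected(ver):
    """Return True only for the exact version the report names as affected."""
    return ver == AFFECTED_VERSION


current = installed_version()
if current is None:
    print("chromadb is not installed in this environment")
elif is_affected(current):
    print(f"chromadb {current} is the version where the message was reported")
else:
    print(f"chromadb {current} is not the version named in the report")
```

If the check reports 0.5.0 and the message bothers you, pinning the package with `pipenv install "chromadb==0.4.24"` should reproduce the setup in which I did not see it, assuming a downgrade is acceptable for your project.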