
Chromadb: Why do results of collection.query() and collection.get() differ?


I am using Chromadb version 0.5.23.

print(collection.query(...))

produces something like:

{'ids': [['id1', 'id2', 'id3']], 'embeddings': None, 'documents': None, 'uris': None, 'data': None, 'metadatas': None, 'distances': [[0.2003527583406446, 0.21832232106694371, 0.23420078419011314]], 'included': [<IncludeEnum.distances: 'distances'>]}

This is a dict with lists of lists.

print(collection.get(...))

produces something like:

{'ids': ['id1', 'id2', 'id3'], 'embeddings': None, 'documents': ['Text1', 'Text2', 'Text3'], 'uris': None, 'data': None, 'metadatas': None, 'included': [<IncludeEnum.documents: 'documents'>]}

A dict with lists.

Is there a specific reason for this behavior? Is it a bug or a feature?

I would expect both results to have the same format. Moreover, I do not see a reason for outer lists that contain only a single element.


Solution

  • It looks like a typo helped me find the answer myself!

    collection.query(query_texts = ['first query', 'second query'])

    accepts multiple query texts, and each query text produces its own list of results. Therefore the result contains one inner list per query:

    {'ids': [[results for first query], [results for second query], ...], ...}

    On the other hand

    collection.get()

    takes no query texts, so it returns a single flat list of the matching documents.
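
    To make the nesting concrete, here is a minimal sketch in plain Python that splits a query()-style result into one flat, get()-style dict per query text. It does not call Chromadb at all: query_result is a hand-written dict with made-up ids and distances, and split_by_query and NESTED_KEYS are hypothetical names of my own, not part of the library. The key names are copied from the printed output in the question.

    ```python
    # Keys that hold one inner list per query text, taken from the
    # printed query() output in the question.
    NESTED_KEYS = ("ids", "embeddings", "documents", "uris", "data",
                   "metadatas", "distances")

    # Hand-written stand-in for the result of a query with TWO query texts.
    # The ids and distances are illustration values, not real output.
    query_result = {
        "ids": [["id1", "id2"], ["id3", "id1"]],
        "distances": [[0.20, 0.22], [0.10, 0.35]],
        "documents": None,
    }

    def split_by_query(result):
        """Hypothetical helper: return one flat dict per query text."""
        n_queries = len(result["ids"])
        per_query = []
        for i in range(n_queries):
            flat = {}
            for key in NESTED_KEYS:
                value = result.get(key)
                # Fields that were not requested stay None.
                flat[key] = None if value is None else value[i]
            per_query.append(flat)
        return per_query

    first, second = split_by_query(query_result)
    print(first["ids"])
    print(second["distances"])
    ```

    With a single query text, the outer list has exactly one element, so result['ids'][0] gives the same kind of flat list that collection.get() returns.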