python chromadb ollama

VannaAI (with Ollama and ChromaDB) sample program fails at training model step


I'm starting to test VannaAI, and I'm running a sample program based on the "Generating SQL for Postgres using Ollama, ChromaDB" guide:

from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore

class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)

vn = MyVanna(config={'model': 'mistral'})

vn.connect_to_postgres(host='<obfuscated>', dbname='<obfuscated>', user='<obfuscated>', password='<obfuscated>', port='<obfuscated>')

# The information schema query may need some tweaking depending on your database. This is a good starting point.
df_information_schema = vn.run_sql("SELECT * FROM INFORMATION_SCHEMA.COLUMNS")

# This will break up the information schema into bite-sized chunks that can be referenced by the LLM
plan = vn.get_training_plan_generic(df_information_schema)
print(plan)

# If you like the plan, then uncomment this and run it to train
print("Training starts")
vn.train(plan=plan)
print("Training ends")

When I run it I get:

...
Training starts
C:\Users\bodoque\.cache\chroma\onnx_models\all-MiniLM-L6-v2\onnx.tar.gz: 100%|██████████| 79.3M/79.3M [00:05<00:00, 16.1MiB/s]
Add of existing embedding ID: 9064de8e-3c0c-4f3b-a02b-215dff373009-doc

Process finished with exit code -1073741819 (0xC0000005)

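print("Training ends") is never reached, so I conclude that the process crashes inside vn.train(plan=plan). Exit code 0xC0000005 is a Windows access violation, which suggests the crash happens in native code rather than as a Python exception.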


Solution

  • Downgrading to chromadb 0.5.3 (which comes with chroma-hnswlib 0.7.3) solved the issue.

    Ref: https://github.com/chroma-core/chroma/issues/2534
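
    The downgrade can be done with pip install chromadb==0.5.3. To confirm which versions actually ended up in the environment, something like the sketch below may help. It only compares the installed packages against the two versions mentioned above; KNOWN_GOOD and check_pins are illustrative names made up for this example, not part of chromadb or vanna, and the script does not detect the crash itself.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in the answer above (not an official compatibility list).
KNOWN_GOOD = {"chromadb": "0.5.3", "chroma-hnswlib": "0.7.3"}


def check_pins(get_version=version):
    """Return {package: (installed_or_None, expected, matches)}.

    get_version is injectable so the check can be exercised without
    the packages being installed.
    """
    report = {}
    for package, expected in KNOWN_GOOD.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (installed, expected, installed == expected)
    return report


if __name__ == "__main__":
    for package, (installed, expected, ok) in check_pins().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{package}: installed={installed} expected={expected} [{status}]")
```

    Run it in the same virtual environment as the Vanna script. A MISMATCH line only means the installed version differs from the one reported to work here.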