I'm starting to test VannaAI, and I'm running a sample program based on the "Generating SQL for Postgres using Ollama, ChromaDB" example:
```python
from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore


class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)


vn = MyVanna(config={'model': 'mistral'})
vn.connect_to_postgres(host='<obfuscated>', dbname='<obfuscated>', user='<obfuscated>', password='<obfuscated>', port='<obfuscated>')

# The information schema query may need some tweaking depending on your database. This is a good starting point.
df_information_schema = vn.run_sql("SELECT * FROM INFORMATION_SCHEMA.COLUMNS")

# This will break up the information schema into bite-sized chunks that can be referenced by the LLM
plan = vn.get_training_plan_generic(df_information_schema)
print(plan)

# If you like the plan, then uncomment this and run it to train
print("Training starts")
vn.train(plan=plan)
print("Training ends")
```
When I run it I get:
```
...
Training starts
C:\Users\bodoque\.cache\chroma\onnx_models\all-MiniLM-L6-v2\onnx.tar.gz: 100%|██████████| 79.3M/79.3M [00:05<00:00, 16.1MiB/s]
Add of existing embedding ID: 9064de8e-3c0c-4f3b-a02b-215dff373009-doc
Process finished with exit code -1073741819 (0xC0000005)
```
The `print("Training ends")` line is never reached, so I understand that `vn.train(plan=plan)` is what crashes. Exit code -1073741819 (0xC0000005) is a Windows access violation, which suggests the process dies in native code rather than raising a Python exception.
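One way to get more detail on where the native crash happens (a debugging sketch, not part of the original sample) is to enable Python's built-in `faulthandler` before training; on Windows it can dump a Python traceback when the interpreter dies with an access violation, though the useful frames may still be inside the native library:

```python
import faulthandler

# Dump the Python traceback to stderr if the interpreter crashes in native
# code (e.g. inside chroma-hnswlib), instead of exiting silently.
faulthandler.enable()

print("Training starts")
vn.train(plan=plan)
print("Training ends")
```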
Downgrading to chromadb 0.5.3 (which pulls in chroma-hnswlib 0.7.3) solved the issue.
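For anyone hitting the same crash: one way to pin that combination is `pip install "chromadb==0.5.3" "chroma-hnswlib==0.7.3"` (assuming a standard pip environment). A small sketch to confirm which versions the script actually loads:

```python
from importlib.metadata import version

# Print the installed chromadb / chroma-hnswlib versions so you can verify
# the downgrade took effect in the environment running the training script.
print("chromadb:", version("chromadb"))
print("chroma-hnswlib:", version("chroma-hnswlib"))
```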