machine-learning · langchain · embedding · chromadb · ollama

Prevent creating embeddings if the ChromaDB folder is already present


This is my first attempt at a RAG application. I am trying to do Q&A using an LLM. I will paste the code below, which is working fine. My problem is that the code that generates the embeddings runs every time I run the Python script. Is there any way to run it only once, or to check whether the embeddings folder is empty and only then run that code?

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain.text_splitter import CharacterTextSplitter

model_local = ChatOllama(model="codellama:7b")

loader = TextLoader("remedy.txt")
raw_doc = loader.load()

# Split the text file content into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
splitted_docs = text_splitter.split_documents(raw_doc)

# Embedding function used to store the chunks in the vector DB
ollamaEmbeddings = OllamaEmbeddings(model="nomic-embed-text")


# Store the chunks in a Chroma vector DB; with persist_directory set, this writes the embeddings to disk
vectorstore = Chroma.from_documents(
    documents=splitted_docs,
    embedding=ollamaEmbeddings,
    persist_directory="./vector/my_data",
)

# Expose the vector store as a retriever
retriever = vectorstore.as_retriever()

# 4. After RAG
print("After RAG\n")
after_rag_template = """
    Answer the question based only on the following context:
    {context}
    Question {question}?
"""
after_rag_prompt = ChatPromptTemplate.from_template(after_rag_template)
after_rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | after_rag_prompt
    | model_local
    | StrOutputParser()
)
print(after_rag_chain.invoke("What are Home Remedy for Common Cold?"))

Solution

  • Every time you run this Python script, you supply a persist directory, so the embeddings are written to disk at that location. You also pass in the same chunked documents and define the same embedding model, so on every run the vector database repeats exactly the same work. You can guard the creation step with a simple folder check; a sketch follows the loading snippet below.

    When you want to load the persisted database from disk, you instantiate the Chroma object, specifying the persist directory and the embedding model, like so:

    # load from disk
    query = "What are Home Remedy for Common Cold?"
    db3 = Chroma(persist_directory="./vector/my_data", embedding_function=ollamaEmbeddings)
    docs = db3.similarity_search(query)
    print(docs[0].page_content)
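
    To make the script itself skip the embedding step when the folder is already populated, you can wrap the two constructors in a simple existence check. This is only a sketch under the assumption that ./vector/my_data is written exclusively by this script; the persist_dir name and the os.listdir check are illustrative choices, not something Chroma requires:

    import os

    persist_dir = "./vector/my_data"  # same folder used above

    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        # Folder already holds a persisted collection, so just reload it
        vectorstore = Chroma(
            persist_directory=persist_dir,
            embedding_function=ollamaEmbeddings,
        )
    else:
        # First run: embed the chunks and write them to disk
        vectorstore = Chroma.from_documents(
            documents=splitted_docs,
            embedding=ollamaEmbeddings,
            persist_directory=persist_dir,
        )

    retriever = vectorstore.as_retriever()

    On the reload path the documents are never re-embedded; only the stored vectors are read back, so the expensive step runs a single time.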