I just started learning to code three days ago, so I am quite new. I am trying to build a GPT-style assistant that answers questions by referencing documents in a Google Drive folder, and I am using LlamaIndex to index the Drive contents.
When I run the query in the following code, it just runs and doesn't stop. I am not sure what the issue is:
!pip install -q google-generativeai
!pip install docx2txt
!pip install pypdf
!pip install llama-index
import pprint
import google.generativeai as palm
from llama_index import (
VectorStoreIndex,
SimpleDirectoryReader,
StorageContext,
ServiceContext,
load_index_from_storage
)
from llama_index.llms import PaLM
palm_api_key = "key"
palm.configure(api_key=palm_api_key)
model = PaLM(api_key=palm_api_key)
# List the models that support text generation; use a separate name here so the
# PaLM LLM instance created above isn't clobbered by the model-name string
models = [m for m in palm.list_models() if 'generateText' in m.supported_generation_methods]
model_name = models[0].name
print(model_name)
#returns: models/text-bison-001
from google.colab import drive
drive.mount('/content/drive')
documents = SimpleDirectoryReader('/content/drive/My Drive/Trajan GPT Corpus').load_data()
index = VectorStoreIndex.from_documents(documents)
#this is running REALLY slowly
query_engine = index.as_query_engine()
response = query_engine.query("What is trajan systems?")
print(response)
I read https://gpt-index.readthedocs.io/en/stable/getting_started/customization.html and tried to insert this line as it directs:
service_context = ServiceContext.from_defaults(llm=PaLM())
But that did not work either.
If you're looking to build a question-and-answer system over a corpus of documents, there are a couple of fully runnable guides on the PaLM site.
The Document Q&A guide uses only the PaLM API and does the "lookup" step in Python with NumPy directly; if you want to simplify a little, there's also a guide that uses ChromaDB as the vector database. Each guide has a link to run it directly in Colab.
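The NumPy "lookup" that guide describes boils down to: embed every document chunk, embed the query, take dot products, and answer from the highest-scoring chunk. Here is a minimal sketch of just that retrieval step, using toy hand-written vectors in place of real embeddings (in the actual guide each vector would come from the PaLM embedding API, e.g. `palm.generate_embeddings`; the chunk texts and numbers below are made up for illustration):

```python
import numpy as np

# Toy document chunks with stand-in 3-dimensional "embeddings".
# Real embeddings would be produced by the PaLM embedding API.
chunks = [
    "Trajan Systems builds custom software.",
    "Bananas are a good source of potassium.",
    "Rome was ruled by emperors for centuries.",
]
doc_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
])

def top_chunk(query_embedding, doc_embeddings, chunks):
    # Dot product of the query against every chunk embedding;
    # the chunk with the highest score is the best match.
    scores = np.dot(doc_embeddings, query_embedding)
    return chunks[int(np.argmax(scores))]

# Pretend this vector is the embedding of "What is Trajan Systems?"
query_embedding = np.array([0.8, 0.2, 0.1])
print(top_chunk(query_embedding, doc_embeddings, chunks))
# prints: Trajan Systems builds custom software.
```

The retrieved chunk would then be pasted into a prompt for the text model (e.g. `models/text-bison-001`) to generate the final answer, which is the part LlamaIndex otherwise handles for you.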