google-colaboratory, palm

In Colab: using PaLM to query an index generated from Google Drive documents (help integrating PaLM)


I just started learning to code three days ago, so I am quite new. I am trying to build a GPT-style assistant that answers questions by referencing documents in a Google Drive folder. I am using LlamaIndex to index the Google Drive contents.

When I run the query in the following code, it just runs and doesn't stop. I am not sure what the issue is:

!pip install -q google-generativeai
!pip install docx2txt
!pip install pypdf
!pip install llama-index

import pprint
import google.generativeai as palm
from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    ServiceContext,
    load_index_from_storage
)
from llama_index.llms import PaLM

palm_api_key = "key"
palm.configure(api_key=palm_api_key)

model = PaLM(api_key=palm_api_key)

models = [m for m in palm.list_models() if 'generateText' in m.supported_generation_methods]
model = models[0].name
print(model)

#returns: models/text-bison-001

from google.colab import drive
drive.mount('/content/drive')

documents = SimpleDirectoryReader('/content/drive/My Drive/Trajan GPT Corpus').load_data()
index = VectorStoreIndex.from_documents(documents)

#this is running REALLY slowly
query_engine = index.as_query_engine()
response = query_engine.query("What is trajan systems?")
print(response)

I read this: https://gpt-index.readthedocs.io/en/stable/getting_started/customization.html

and tried to insert this line as it directs:

service_context = ServiceContext.from_defaults(llm=PaLM())

but that did not work.
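One thing worth checking (this is an assumption on my part, based on the pre-0.10 llama_index API the question uses): creating the ServiceContext alone does nothing; it has to be passed into the index. Also, the default embedding model in that API is OpenAI's, so unless an embed_model is configured too, indexing still tries to call OpenAI, which could explain a hang. A minimal sketch, not tested end to end:

```python
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.llms import PaLM

palm_api_key = "key"

# Hand the PaLM LLM to a ServiceContext; from_defaults also accepts an
# embed_model argument, which otherwise defaults to OpenAI embeddings.
service_context = ServiceContext.from_defaults(llm=PaLM(api_key=palm_api_key))

documents = SimpleDirectoryReader('/content/drive/My Drive/Trajan GPT Corpus').load_data()

# The service_context must be passed here, or the defaults are used.
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
```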


Solution

  • If you're looking to build a question/answer system for a corpus of documents, there are a couple of fully runnable guides on the PaLM site.

    The Document Q&A guide uses only the PaLM API and does the "lookup" step directly in Python with Numpy; if you want to simplify a little, there is also a guide that uses ChromaDB as a vector database. Each guide includes a link to run it in Colab directly.
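    To illustrate what the Numpy-based "lookup" amounts to: retrieval is just a cosine-similarity comparison between the question's embedding and each document's embedding. The sketch below uses made-up toy vectors standing in for real PaLM embeddings (a real version would call the PaLM embedding endpoint to produce them):

    ```python
    import numpy as np

    # Toy stand-ins; real embeddings would come from the PaLM API.
    doc_texts = ["Trajan Systems builds software.",
                 "Bananas are yellow.",
                 "The Colab notebook mounts Google Drive."]
    doc_embeddings = np.array([[0.9, 0.1, 0.0],
                               [0.0, 1.0, 0.1],
                               [0.1, 0.0, 0.9]])
    # Pretend embedding of the question "What is Trajan Systems?"
    query_embedding = np.array([1.0, 0.0, 0.1])

    # Cosine similarity between the query and every document.
    sims = doc_embeddings @ query_embedding
    sims /= np.linalg.norm(doc_embeddings, axis=1) * np.linalg.norm(query_embedding)

    # The highest-scoring document is the one fed to the LLM as context.
    best = int(np.argmax(sims))
    print(doc_texts[best])  # -> Trajan Systems builds software.
    ```

    The retrieved text is then pasted into the prompt sent to the text model, which is all the "index" is doing under the hood.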