I am trying to use Generative Search (RAG) from Weaviate following this guide. My objective is to feed data from a Law School Handbook to an LLM so that I can run generative searches on it.
I have created a sandbox instance in WCS:
import weaviate

auth_config = weaviate.AuthApiKey(api_key="YOUR-WEAVIATE-API-KEY")

client = weaviate.Client(
    url="https://sandbox-handbook-reader-rmvhz5aq.weaviate.network",
    auth_client_secret=auth_config,
    additional_headers={
        "X-OpenAI-Api-Key": "YOUR-OPENAI-KEY"  # Replace with your inference API key
    }
)
I then downloaded and chunked my handbook text and created a class that I named "HandbookChunk". For the chunking, I chose 150 tokens per chunk, with each chunk starting with a 25-token overlap from the previous one (781 chunks in total).
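For reference, the chunking step can be sketched like this. This is a minimal illustration using whitespace "tokens"; the real counts depend on the tokenizer used (e.g. tiktoken), and `chunk_text` is a hypothetical helper, not part of the Weaviate client:

```python
def chunk_text(text, chunk_size=150, overlap=25):
    """Split text into chunks of `chunk_size` tokens, where each chunk
    repeats the last `overlap` tokens of the previous chunk."""
    tokens = text.split()  # crude whitespace tokenization, for illustration only
    step = chunk_size - overlap  # advance 125 tokens per chunk
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break
    return chunks

# Example: a 1,000-token text yields 8 overlapping chunks
sample = " ".join(f"w{i}" for i in range(1000))
print(len(chunk_text(sample)))  # -> 8
```

Each chunk string would then be imported as one "HandbookChunk" object.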
I then imported the data to Weaviate and can see it in my instance:
client.query.aggregate("HandbookChunk").with_meta_count().do()
It returns:
{'data': {'Aggregate': {'HandbookChunk': [{'meta': {'count': 781}}]}}}
When I run a query with a grouped_task but don't set the .with_limit() parameter, I get a connection error:
import json

response = (
    client.query
    .get("HandbookChunk", ["chunk_index", "chunk"])
    .with_generate(
        grouped_task="Provide jurisprudence on surveillance cases"
    )
    .with_near_text({
        "concepts": ["law cases involving surveillance"]
    })
    # .with_limit(5)
).do()
print(json.dumps(response, indent=2))
The response contains "error": "connection to: OpenAI API failed with status: 400 error: -19577 is less than the minimum of 1 - 'max_tokens'":
{
"data": {
"Get": {
"HandbookChunk": [
{
"_additional": {
"generate": {
"error": "connection to: OpenAI API failed with status: 400 error: -19577 is less than the minimum of 1 - 'max_tokens'",
"groupedResult": null
}
},
"chunk": "reintegration into society. [...] are available. Consequently,",
"chunk_index": 546
},
{
"_additional": {
"generate": null
},
"chunk": "activities which are [...] as an activity which is",
"chunk_index": 226
},...
]
}
}
}
Is it because the prompt would be too long (over the max_tokens)?
That is right!
When you remove the .with_limit(5), you are probably maxing out the model's max_tokens.
You can reduce the context you are providing, or find a model that accepts more tokens in its input.
Check here for the other generative modules we have in Weaviate: https://weaviate.io/developers/weaviate/modules/reader-generator-modules