openai-api, weaviate

RAG - "error": "connection to: OpenAI API failed with status: 400 error: -19577 is less than the minimum of 1 - 'max_tokens'"


I am trying to use Generative Search (RAG) in Weaviate, following this guide. My objective is to feed data from a Law School Handbook to an LLM so that I can run generative searches on it.

I have created a sandbox instance in WCS:

import weaviate

auth_config = weaviate.AuthApiKey(api_key="YOUR-WEAVIATE-API-KEY")

client = weaviate.Client(
    url="https://sandbox-handbook-reader-rmvhz5aq.weaviate.network",
    auth_client_secret=auth_config,
    additional_headers={
        "X-OpenAI-Api-Key": "YOUR-OPENAI-KEY"  # Replace with your inference API key
    },
)

I then downloaded and chunked my handbook text and created a class named "HandbookChunk". For the chunking, I chose 150 tokens per chunk with a 25-token overlap at the beginning of each chunk (781 chunks in total).
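The chunking logic looked roughly like this (a minimal sketch, assuming tiktoken's cl100k_base encoding; the actual tokenizer and helper name are just for illustration):

import tiktoken

def chunk_text(text, chunk_size=150, overlap=25):
    # Each chunk starts with the last `overlap` tokens of the previous one
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(enc.decode(tokens[start:start + chunk_size]))
    return chunks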

I then imported the data to Weaviate and can see it in my instance:

client.query.aggregate("HandbookChunk").with_meta_count().do()

It returns:

{'data': {'Aggregate': {'HandbookChunk': [{'meta': {'count': 781}}]}}}
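For completeness, the class definition and the import were roughly as follows (a sketch using the v3 Python client; only the two properties used in the query below, everything else left at defaults):

class_obj = {
    "class": "HandbookChunk",
    "vectorizer": "text2vec-openai",            # vectorize chunks with OpenAI
    "moduleConfig": {"generative-openai": {}},  # enable the generative (RAG) module
    "properties": [
        {"name": "chunk", "dataType": ["text"]},
        {"name": "chunk_index", "dataType": ["int"]},
    ],
}
client.schema.create_class(class_obj)

# Batch-import the 781 chunks
client.batch.configure(batch_size=100)
with client.batch as batch:
    for i, chunk in enumerate(chunks):
        batch.add_data_object(
            data_object={"chunk": chunk, "chunk_index": i},
            class_name="HandbookChunk",
        )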

When using a grouped_task in the query, if I don't set the .with_limit() parameter, I get a connection error:

import json

response = (
    client.query
    .get("HandbookChunk", ["chunk_index", "chunk"])
    .with_generate(
        grouped_task="Provide jurisprudence on surveillance cases"
    )
    .with_near_text({
        "concepts": ["law cases involving surveillance"]
    })
    #.with_limit(5)
).do()

print(json.dumps(response, indent=2))

I get "error": "connection to: OpenAI API failed with status: 400 error: -19577 is less than the minimum of 1 - 'max_tokens'":

{
  "data": {
    "Get": {
      "HandbookChunk": [
        {
          "_additional": {
            "generate": {
              "error": "connection to: OpenAI API failed with status: 400 error: -19577 is less than the minimum of 1 - 'max_tokens'",
              "groupedResult": null
            }
          },
          "chunk": "reintegration into society. [...] are available. Consequently,",
          "chunk_index": 546
        },
        {
          "_additional": {
            "generate": null
          },
          "chunk": "activities which are [...] as an activity which is",
          "chunk_index": 226
        },...
      ]
    }
  }
}

Is it because the prompt sent to the model would be too long (exceeding the model's max_tokens)?


Solution

  • That is right!

    When you remove .with_limit(5), the grouped task probably maxes out this model's token limit: the prompt built from all the retrieved chunks already exceeds the context window, so the remaining budget for the completion (max_tokens) comes out negative (-19577 here), and the OpenAI API rejects the request with a 400.

    You can reduce the context you are providing (e.g. by keeping .with_limit()) or find a model that will accept more tokens on its input.

    Check here for the other generative modules we have in Weaviate: https://weaviate.io/developers/weaviate/modules/reader-generator-modules
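
    For example, simply keeping the limit in place bounds the context sent to the model (5 chunks of ~150 tokens is roughly 750 tokens of context):

    response = (
        client.query
        .get("HandbookChunk", ["chunk_index", "chunk"])
        .with_generate(
            grouped_task="Provide jurisprudence on surveillance cases"
        )
        .with_near_text({
            "concepts": ["law cases involving surveillance"]
        })
        .with_limit(5)  # cap how many chunks are fed into the grouped prompt
        .do()
    )

    You can raise the limit until the combined prompt approaches the model's context window.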