I have an interface for ChatGPT which usually works, but it randomly raises a KeyError that I currently cannot resolve or avoid. It uses the Solara library to create a web page from the Python code in a Jupyter notebook.
import requests  # these imports are needed at module level
import time

try:
    global gMemory  # conversation history, stored as one string
    gptResponse.value = "==GENERATING-RESPONSE=="  # placeholder output while waiting
    url = "https://api.openai.com/v1/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer [REDACTED]",
    }
    data = {
        "model": "text-davinci-003",
        "prompt": (gMemory + "\nUSER REQUEST: " + text).strip(),
        "max_tokens": 3000,
        "temperature": 1.0,
    }
    response = requests.post(url, headers=headers, json=data)
    out = response.json()['choices'][0]['text'].strip()
    gMemory += "USER REQUEST: " + text + out + "\n"  # text is the user's request, out is the OpenAI response
    gptResponse.value = out  # replace the placeholder with the response
except KeyError:
    time.sleep(10)
    gptResponse.value = "==ERROR-//-ENTER-A-NEW-REQUEST-;-OR-REFRESH=="
    time.sleep(10)
submitButton.value = False  # allows requests to be submitted again
The above code sends the request to OpenAI along with the conversation history (combined into one string). It then appends the new exchange to the history and updates the output value for the GUI.

I have wrapped this in a try/except, which does catch the error. However, the error persists for any new request, regardless of how long I wait or whether the request is different. It only works again after refreshing the page, which loses all history and restarts the program.
By adding a block of code to display response.json() without any formatting, I found the error that was actually present: the prompt and response together can only add up to 4097 tokens, and my total was too high. So I will have to reduce the size allowed for the response. Less than ideal, but it is a solution.
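For reference, a minimal sketch of that debugging step (variable names follow the snippet above): when a request is rejected, the completions endpoint returns an "error" object instead of "choices", so printing the raw JSON, or checking for the key before indexing, surfaces the real message rather than a bare KeyError.

result = response.json()
print(result)  # raw JSON shows the API's actual complaint, e.g. the 4097-token limit
if "error" in result:
    # show the API's message in the GUI instead of a generic error banner
    gptResponse.value = "==ERROR: " + result["error"].get("message", "unknown") + "=="
else:
    out = result["choices"][0]["text"].strip()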
tl;dr: "max_tokens": 3000 needs changing to a lower value, because the token count grows as the conversation continues and eventually exceeds the limit.

Note that max_tokens is only the maximum for the response; you still need to account for the size of the request, since the prompt and max_tokens together must fit within the model's limit.
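As a rough sketch of that budgeting (the trim_history helper and the ~4-characters-per-token estimate are my own assumptions, not part of the original code), the history can be trimmed from the front so that prompt tokens plus max_tokens stay under the 4097-token limit:

MODEL_LIMIT = 4097          # prompt + response must fit in this many tokens
MAX_RESPONSE_TOKENS = 1000  # lowered from 3000 to leave room for the prompt

def trim_history(memory, budget_tokens):
    # Rough estimate: ~4 characters per token for English text.
    max_chars = budget_tokens * 4
    if len(memory) > max_chars:
        # Drop the oldest part of the conversation, keep the most recent.
        memory = memory[-max_chars:]
    return memory

gMemory = trim_history(gMemory, MODEL_LIMIT - MAX_RESPONSE_TOKENS)
data["max_tokens"] = MAX_RESPONSE_TOKENS  # then send the request as before

A real implementation could count tokens exactly with a tokenizer library such as tiktoken, but the character estimate is enough to keep requests under the limit.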