I have a web application built with ReactJS and Node.js. The application calls the OpenAI APIs.
Until now, when a user made a request in the frontend, we sent it to an endpoint in our backend, which called createChatCompletion
from https://github.com/openai/openai-node and returned the result to the frontend. Note that our frontend server and our backend server are separate and not in the same location, and our users are spread all over the world.
We just realized that we can also call https://api.openai.com/v1/chat/completions
directly from the frontend, as follows:
const res = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${API_KEY}`
  },
  body: JSON.stringify({
    model: model,
    messages: [{ role: "user", content: prompt }]
  })
})
At the moment, our pain point is that the time between a user sending a request and seeing the result in the application is too long. Purely from the perspective of speed, does anyone know which approach is better, and why?
One thing to keep in mind is that everything in your frontend is essentially public. If you make the call directly from the browser, it's trivial for users to capture your API key, for example by inspecting the request headers in the browser's developer tools.
Removing the hop through your server likely won't make a significant difference anyway: generating the completion is slow, and that time usually dwarfs the extra network round trip. A better solution may be to use the streaming API (and also stream from your backend to your frontend) so that users can see the response as it's generated.
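To illustrate the streaming suggestion, here is a minimal sketch rather than a definitive implementation. It assumes Node 18+ on the backend (global fetch, and a response body that can be iterated with for await) and calls the REST endpoint from the question with stream: true instead of going through openai-node. The names parseSseBuffer, streamChatCompletion, readTextStream and onDelta are hypothetical helpers made up for this example, and the parsing assumes each event arrives as a "data: {...}" line ending with "data: [DONE]".

```javascript
// Parse the server-sent-event text accumulated so far.
// Returns the text fragments found in complete lines, the unparsed
// remainder (a possibly incomplete last line), and whether [DONE] was seen.
function parseSseBuffer(buffer) {
  const lines = buffer.split("\n");
  const rest = lines.pop(); // the last piece may be an incomplete line
  const deltas = [];
  let done = false;
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank separator lines
    const payload = trimmed.slice(5).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) deltas.push(delta);
  }
  return { deltas, rest, done };
}

// Backend side (hypothetical helper): request a streamed completion and
// call onDelta for every text fragment. The API key never leaves the server.
async function streamChatCompletion({ apiKey, model, prompt, onDelta }) {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
      stream: true, // ask for incremental chunks instead of one final body
    }),
  });
  if (!res.ok || !res.body) {
    throw new Error(`OpenAI request failed with status ${res.status}`);
  }

  const decoder = new TextDecoder();
  let buffer = "";
  for await (const chunk of res.body) {
    buffer += decoder.decode(chunk, { stream: true });
    const { deltas, rest, done } = parseSseBuffer(buffer);
    buffer = rest;
    deltas.forEach(onDelta);
    if (done) break;
  }
}

// Frontend side (hypothetical helper): read a plain-text streamed response
// from your own backend and hand each piece to onText as it arrives.
async function readTextStream(response, onText) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    onText(decoder.decode(value, { stream: true }));
  }
}
```

In a backend route handler, onDelta could simply write each fragment to the HTTP response (for example res.write(delta) on a Node response object, followed by res.end() once the promise resolves), and the frontend could pass the result of its fetch call to readTextStream and append each piece to React state. Total generation time stays about the same, but the first words show up much sooner, which is usually what users perceive as speed.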