google-cloud-platform google-gemini

Don't understand resource exhaustion using Gemini


I am currently developing a web application using Google Gemini and I am running into a resource issue I do not seem to understand. Using the free tier, I should be able to send 30 requests per minute, which I can do just fine running the code below for n=15 twice. However, if I run the same code for n>=20, I receive 429 Resource exhausted errors.

I feel like I should be able to do 30 requests. Am I perhaps sending requests too fast by firing off 20 or more at the same time?

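import asyncio

async def async_rate_ideas():
    # Create n coroutines up front; gather() then runs them all concurrently,
    # so every request goes out at nearly the same moment.
    get_responses = [rate_ideas(7) for _ in range(n)]
    return await asyncio.gather(*get_responses)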

Solution

  • Apparently, a limit of 30 RPM does not mean that 30 requests can be sent simultaneously. Requests need to be spread out somewhat over the minute. Adding a small delay between requests fixed my problem.
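
For anyone who wants to see what "a small delay between each request" could look like, here is a minimal sketch, not the exact code I ended up with. The rate_ideas below is a hypothetical stand-in for the real Gemini call, and async_rate_ideas_spaced and its delay parameter are names made up for this example. The default of 2 seconds is just 60 s / 30 requests; whether a shorter delay is enough is something I have not verified.

```python
import asyncio

async def rate_ideas(k):
    # Hypothetical stand-in for the real Gemini request from the question.
    await asyncio.sleep(0.01)
    return k

async def async_rate_ideas_spaced(n, delay=2.0):
    # Start one request every `delay` seconds instead of all at once.
    # Earlier requests keep running while we wait to start the next one.
    tasks = []
    for i in range(n):
        tasks.append(asyncio.create_task(rate_ideas(7)))
        if i < n - 1:
            await asyncio.sleep(delay)
    # Collect the results in the order the requests were started.
    return await asyncio.gather(*tasks)
```

Calling asyncio.run(async_rate_ideas_spaced(20)) would then start the 20 requests over roughly 38 seconds rather than in a single burst.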