Mobile app -----> Firebase Function
                  | ------------------> External API
                  | waiting...
                  | CPU billed?
                  | <------------------ External API responds
Mobile app <----- Firebase Function responds
In my 2nd gen Firebase Function, I am making a request to an external API. Naturally, that API will take some time to respond. My Firebase Function will return with a response after that API request completes.
My question is, will I be billed for CPU time while waiting on that external API? The CPU is not actively being used, so it would make sense if I was not billed. I asked Gemini and it strongly agrees with this point of view in this Gemini chat: https://g.co/gemini/share/d37b5aed9c41 (please scroll down to the latest question there, the beginning of the chat is about something else). Gemini thinks request-based billing means just this.
However, I'm not convinced. The graph in the Billable Instance Time documentation makes me think that I am actually billed for the CPU during that idle time, when the CPU does nothing except wait for the API to respond.
Does anybody have a definitive answer to this? Am I billed the same as someone running a CPU-intensive task, merely for waiting on a network response?
"will I be billed for CPU time while waiting on that external API"
Yes.
"The CPU is not actively being used, so it would make sense if I was not billed"
You are billed for as long as a function is in the middle of an invocation, from the time it starts to the time it returns a response. How busy the CPU is never matters; what matters is that the CPU is allocated and available to perform work during the invocation. The only time you are not billed for CPU on a given server instance is when that instance has no active requests (which eventually allows it to scale down).
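To make the point concrete, here is a minimal sketch of that billing model in TypeScript. It is an illustration only, not an official pricing formula: the `Invocation` type, the `billedVcpuSeconds` function, and the sample numbers are all hypothetical, and the sketch ignores rounding, free tiers, and memory charges.

```typescript
// Hypothetical model of request-based CPU billing for a single invocation.
// Names and numbers are assumptions for illustration, not a Firebase API.
interface Invocation {
  startMs: number;   // when the function started handling the request
  endMs: number;     // when the function returned its response
  cpuBusyMs: number; // time the CPU spent computing (recorded, but not used for billing)
}

// Billed CPU depends on wall-clock duration and allocated vCPUs,
// not on how busy the CPU was during the invocation.
function billedVcpuSeconds(inv: Invocation, vcpus: number): number {
  const wallSeconds = (inv.endMs - inv.startMs) / 1000;
  return wallSeconds * vcpus;
}

// One invocation mostly waits on an external API, the other computes the whole time.
const waitingOnApi: Invocation = { startMs: 0, endMs: 3000, cpuBusyMs: 20 };
const crunching: Invocation = { startMs: 0, endMs: 3000, cpuBusyMs: 2990 };

console.log(billedVcpuSeconds(waitingOnApi, 1)); // 3
console.log(billedVcpuSeconds(crunching, 1));    // 3
```

Under this model both invocations cost the same, because `cpuBusyMs` never enters the calculation.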
2nd gen functions improve on this by allowing a single instance to handle multiple concurrent requests, so that requests in flight at the same time share that instance's billed time instead of each paying for an instance of their own. You will want to read this documentation to better understand how it works. Specifically:
"When setting concurrency higher than one request at a time, multiple requests can share the allocated CPU and memory of an instance."
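As a rough illustration of why concurrency helps with I/O-bound functions, the sketch below compares billed instance time at different concurrency limits. It is a simplified, hypothetical model: `instanceSeconds` is not part of any Firebase API, and it assumes all requests arrive at the same moment, take equally long, and are packed onto as few instances as the limit allows.

```typescript
// Hypothetical model: `requests` identical requests arrive simultaneously,
// each spending `waitSeconds` mostly waiting on an external API.
// `concurrency` is the maximum number of requests one instance handles at once.
function instanceSeconds(
  requests: number,
  waitSeconds: number,
  concurrency: number
): number {
  if (requests < 0 || waitSeconds < 0) {
    throw new RangeError("requests and waitSeconds must be non-negative");
  }
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new RangeError("concurrency must be a positive integer");
  }
  // Each instance is billed once for the whole window in which it has
  // at least one active request, however many requests share it.
  const instances = Math.ceil(requests / concurrency);
  return instances * waitSeconds;
}

console.log(instanceSeconds(80, 2, 1));  // 160: one instance per request
console.log(instanceSeconds(80, 2, 80)); // 2: all requests share one instance
```

Real traffic is not this tidy, and CPU-heavy requests sharing an instance would slow each other down, so the savings apply mainly to handlers that spend most of their time waiting on the network.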