We use Google Cloud Run to wrap an analysis developed in R behind a web API. For this, we have a small Fastify app that launches an R script and uploads the results to Google Cloud Storage. The process's stdout and stderr are written to a file that is also uploaded at the end of the analysis.
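For context, a trimmed-down sketch of such a setup could look like this (the script name, bucket name, and log path are placeholders, not our actual code):

```js
// Sketch only: launch an R script, redirect its stdout/stderr to a log
// file, and upload the log when the script finishes.
const fastify = require('fastify')({ logger: true });
const { spawn } = require('node:child_process');
const fs = require('node:fs');
const { Storage } = require('@google-cloud/storage');

const storage = new Storage();

fastify.post('/analyze', async () => {
  const logPath = '/tmp/analysis.log';
  const logFd = fs.openSync(logPath, 'w');

  // 'analysis.R' is a placeholder for the actual script.
  const child = spawn('Rscript', ['analysis.R'], {
    stdio: ['ignore', logFd, logFd], // write stdout and stderr into the log file
  });

  const exitCode = await new Promise((resolve) => child.on('close', resolve));
  fs.closeSync(logFd);

  // 'my-results-bucket' is a placeholder bucket name; the real app also
  // uploads the analysis results here.
  await storage.bucket('my-results-bucket').upload(logPath);
  return { exitCode };
});

fastify.listen({ port: Number(process.env.PORT) || 8080, host: '0.0.0.0' });
```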
However, we sometimes run into issues when a process takes longer to execute than expected. In these cases, nothing gets uploaded, and debugging is difficult because stdout and stderr are "lost" on the instance. The only thing we see in the Cloud Run logs is this message:
> The request has been terminated because it has reached the maximum request timeout
Is there a recommended way to handle a request timeout?
In App Engine there used to be a descriptive error: DeadlineExceededError for Python and DeadlineExceededException for Java.
We are currently evaluating the following approach.
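(The code from the original question is not reproduced here; the following is only a stand-in illustration of the general idea: race the R process against a self-imposed deadline and upload the partial log before Cloud Run kills the request. `uploadLog` and the deadline value are assumptions.)

```js
// Illustration only: enforce our own deadline below the Cloud Run request
// timeout so there is still time to upload the log.
const DEADLINE_MS = 55 * 1000; // assumes a configured request timeout of 60s

async function runWithDeadline(child, logPath) {
  let timer;
  const finished = new Promise((resolve) => child.on('close', resolve));
  const timedOut = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), DEADLINE_MS);
  });

  const outcome = await Promise.race([finished, timedOut]);
  clearTimeout(timer);

  if (outcome === 'timeout') {
    child.kill('SIGKILL'); // stop the R process
    await uploadLog(logPath); // hypothetical helper that uploads the partial log
    throw new Error('analysis exceeded the self-imposed deadline');
  }

  await uploadLog(logPath);
  return outcome; // exit code of the R process
}
```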
This feels a little complicated, so any feedback is very much appreciated.
I am not sure whether the documentation existed when I posted this question, but there are a few things that help in handling the timeout:
- `SIGTERM`: 10 seconds before the server shuts down, the process receives a SIGTERM signal, followed by SIGKILL on shutdown (see Cloud Run: Container runtime contract | Forced termination).
- `CLOUD_RUN_TIMEOUT_SECONDS`: this environment variable is set to the configured request timeout¹.

With this combination, we can dynamically handle the timeout.
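As a minimal sketch of what that can look like (`uploadLog` is a hypothetical helper as above, and the safety margin is a guess):

```js
// Derive our own deadline from the configured request timeout, leaving a
// safety margin so the log upload can finish before the request is cut off.
const timeoutSeconds = Number(process.env.CLOUD_RUN_TIMEOUT_SECONDS || 300);
const SAFETY_MARGIN_MS = 5000;
const deadlineMs = timeoutSeconds * 1000 - SAFETY_MARGIN_MS;
// deadlineMs can then replace a hard-coded value like DEADLINE_MS above.

// Cloud Run sends SIGTERM 10 seconds before SIGKILL; use that window to
// flush whatever log output already exists.
process.on('SIGTERM', async () => {
  try {
    await uploadLog('/tmp/analysis.log'); // hypothetical helper
  } finally {
    process.exit(0);
  }
});
```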
¹ At the time of writing, this does not seem to be documented. I found out by printing all environment variables in a 'debug' endpoint. However, I found it in functions-framework-java, where it is used to set a TimeoutFilter.
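For completeness, such a debug endpoint can be as small as this (remove it again afterwards, since environment variables may contain secrets):

```js
// Temporary endpoint to inspect the environment of a running instance.
fastify.get('/debug/env', async () => process.env);
```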