google-cloud-platform google-cloud-run vpc

Cloud Run Direct VPC Egress Connection Timeout Issue


I have a Compute Engine VM that runs MySQL and Redis, and a Cloud Run service that connects to both. I recently switched from a VPC connector to Direct VPC egress, and since then I have been getting intermittent connection timeouts:

  File "uvloop/loop.pyx", line 2039, in create_connection
  File "uvloop/loop.pyx", line 2016, in uvloop.loop.Loop.create_connection
TimeoutError: [Errno 110] Connection timed out

My Cloud Run service is running a Python API, and it uses aiomysql to connect to MySQL and redis to connect to Redis. In both cases, I am occasionally getting the above connection timeout error.

I've noticed that this tends to happen when the Cloud Run service is spinning up more containers. Is it possible that this is caused by latency in reserving static internal IP addresses for the Cloud Run service?

I also have a firewall policy that allows access to the VM based on network tags. Is it possible that there's a delay before the firewall picks up the network tags on newly started instances?
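One way to check whether new instances are briefly unable to reach the VM is to probe the database port at container startup and log how long it takes before the first TCP connection succeeds. The following is only a minimal sketch using the standard library; the function names `probe_tcp` and `probe_until_reachable` are hypothetical helpers, and the host, port, and timing values are assumptions to adjust for your setup.

```python
import asyncio
import time


async def probe_tcp(host: str, port: int, timeout: float = 3.0) -> tuple[bool, float]:
    """Attempt one TCP connection; return (succeeded, seconds elapsed)."""
    start = time.monotonic()
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout=timeout
        )
    except (OSError, asyncio.TimeoutError):
        # Covers refused connections, unreachable hosts, and timeouts.
        return False, time.monotonic() - start
    writer.close()
    await writer.wait_closed()
    return True, time.monotonic() - start


async def probe_until_reachable(host, port, attempts=10, delay=1.0, timeout=3.0):
    """Probe repeatedly; return (ok, elapsed) results up to the first success."""
    results = []
    for _ in range(attempts):
        ok, elapsed = await probe_tcp(host, port, timeout)
        results.append((ok, elapsed))
        if ok:
            break
        await asyncio.sleep(delay)
    return results
```

Calling `await probe_until_reachable(vm_ip, 3306)` early in the application's startup and logging the result would show whether failed probes cluster in the first seconds of a new instance's life, which would be consistent with a propagation delay rather than a problem on the database side.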


Solution

  • Following up: it turned out that the newly reserved internal IP addresses weren't getting the network tags fast enough, and/or the firewall policy wasn't detecting them fast enough. Changing the firewall rule to allow the subnet's entire CIDR IP range, so that it no longer relies on network tags, did the trick. I haven't had a single hiccup since that change.
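For illustration, the change described above amounts to matching ingress traffic on a source IP range instead of source tags. Here is a rough sketch that builds a request body in the shape of the Compute Engine `firewalls.insert` REST resource, assuming a VPC firewall rule. The rule name, network path, CIDR range, and ports (3306 and 6379, the MySQL and Redis defaults) are placeholders, not values from the original setup, and `build_ingress_rule` is a hypothetical helper.

```python
import ipaddress


def build_ingress_rule(name, network, source_cidr, tcp_ports):
    """Build a firewall rule body that matches on source IP range, not tags."""
    # Raises ValueError if the CIDR is malformed or has host bits set.
    cidr = ipaddress.ip_network(source_cidr, strict=True)
    return {
        "name": name,
        "network": network,
        "direction": "INGRESS",
        # The subnet range used for Direct VPC egress (placeholder value).
        "sourceRanges": [str(cidr)],
        "allowed": [
            {"IPProtocol": "tcp", "ports": [str(p) for p in tcp_ports]}
        ],
    }


rule = build_ingress_rule(
    name="allow-cloud-run-subnet",     # hypothetical rule name
    network="global/networks/my-vpc",  # hypothetical network
    source_cidr="10.8.0.0/26",         # hypothetical subnet range
    tcp_ports=[3306, 6379],            # assumed default MySQL and Redis ports
)
```

Because the rule matches on the subnet range, any instance that receives an address from that subnet is covered as soon as it starts, with no dependency on tag propagation. The trade-off is that everything else in that subnet can also reach the VM on those ports, so a subnet dedicated to the Cloud Run service keeps the rule narrow.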