
Hasura Cloud Run and Cloud SQL pool settings/database errors


We're using Cloud Run and Cloud SQL (GCP) in our current setup with Hasura 2.0.9.

Cloud Run is set up to run a minimum of 5 and a maximum of 150 instances; about 80-90 instances are running on average.

Cloud SQL is set up to accept up to 500 connections (4 vCPUs and 15 GB of RAM).

Cloud Run averages ~350 requests per second.

I'm getting errors on Cloud SQL:

db=postgres,user=postgres FATAL: remaining connection slots are reserved for non-replication superuser connections

and:

db=postgres,user=postgres FATAL: sorry, too many clients already

And 500/503 errors on Cloud Run:

severity: "ERROR" <--- 500
textPayload: "The request failed because the instance could not start successfully."
severity: "ERROR" <--- 503
textPayload: "The request failed because either the HTTP response was malformed or connection to the instance had an error."

This is the databases.yaml:

- name: default
  kind: postgres
  configuration:
    connection_info:
      database_url:
        from_env: HASURA_GRAPHQL_DATABASE_URL
      isolation_level: read-committed
      pool_settings:
        connection_lifetime: 600
        idle_timeout: 180
        max_connections: 400
        retries: 1
      use_prepared_statements: true
  tables: "!include default/tables/tables.yaml"
  functions: "!include default/functions/functions.yaml"

Is the above YAML OK to use, or should I limit max_connections to 500 (DB connection limit) / 150 instances = ~3 per instance? Right now, database monitoring shows the connection count going beyond the pool's max_connections setting of 400 and hitting the 500-connection limit of Cloud SQL.
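To make the arithmetic behind that question concrete, here is a minimal sketch of the connection budget. It assumes that max_connections in pool_settings applies to each Hasura instance separately (so every Cloud Run instance can open its own full pool), and it sets aside a few slots for superuser connections. The reserved value of 3 is PostgreSQL's default for superuser_reserved_connections; the actual value on a given Cloud SQL instance may differ. The function names are hypothetical, chosen only for illustration.

```python
def per_instance_pool_size(db_limit: int, reserved: int, max_instances: int) -> int:
    """Largest per-instance pool size such that max_instances full pools
    still fit within the database connection limit."""
    usable = db_limit - reserved
    if usable <= 0 or max_instances <= 0:
        raise ValueError("need a positive usable connection count and instance count")
    return usable // max_instances


def worst_case_connections(pool_size: int, instances: int) -> int:
    """Connections requested if every instance fills its pool."""
    return pool_size * instances


DB_LIMIT = 500        # Cloud SQL connection limit from the question
RESERVED = 3          # assumption: PostgreSQL default superuser_reserved_connections
MAX_INSTANCES = 150   # Cloud Run max instances from the question

# Budget per instance at full scale-out: (500 - 3) // 150
print(per_instance_pool_size(DB_LIMIT, RESERVED, MAX_INSTANCES))  # 3

# Worst case with the current setting of 400 on ~90 running instances
print(worst_case_connections(400, 90))  # 36000
```

Under these assumptions, a pool of 400 per instance can request far more than 500 connections once a few instances are busy, which would be consistent with the "too many clients already" errors above.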

I cannot find a sweet spot where the infrastructure doesn't fail. I even tried pgpool (after removing pool_settings from databases.yaml), but it didn't get any better. I've been trying combinations for the last couple of days with no luck.

Any help is much appreciated.


Solution

  • You may have run into the Cloud Run -> Cloud SQL connection limit, which is 100 connections according to this quota page.

    A good upgrade would be to replace the connector-based database connections with a VPC network-based approach.

    There is an extensive guide put together by @guilaume about VPC Connectors with Cloud Run.

    Cloud SQL with private IP only: the Good, the Bad and the Ugly