google-cloud-platform google-cloud-storage google-cloud-bigtable

Google Bigtable: What is the default connection pool size?


I didn't find the answer on the documentation page: https://cloud.google.com/bigtable/docs/connection-pools

Hopefully someone knows here, thanks!


Solution

  • The default channel pool size is 2 x the number of CPUs: https://github.com/googleapis/java-bigtable/blob/9a0f9838c9a96ae1108da36f79bf1a4cdf4b5749/google-cloud-bigtable/src/main/java/com/google/cloud/bigtable/data/v2/stub/EnhancedBigtableStubSettings.java#L243

    This approach can sometimes backfire in Linux containers (and Kubernetes by extension), which can present the number of CPUs as 1.
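    To see what the 2 x CPUs rule would give in a particular environment, a small check like the one below can help. It is only a sketch: it assumes the client derives the CPU count from `Runtime.availableProcessors()`, and the class and method names are made up for illustration rather than taken from the Bigtable client.

```java
// Sketch only: mirrors the "2 x CPUs" rule described above, not the client's actual code.
final class PoolSizeEstimate {

    // Hypothetical helper: pool size for a given CPU count under the 2 x CPUs rule.
    static int defaultPoolSize(int cpus) {
        return 2 * cpus;
    }

    public static void main(String[] args) {
        // Inside a CPU-limited container the JVM may report fewer processors than the host has.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("Reported CPUs: " + cpus);
        System.out.println("Estimated default pool size: " + defaultPoolSize(cpus));
    }
}
```

    Running this inside the container (rather than on the host) shows the value the JVM actually sees there.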

    Initializing each channel is quite expensive. The channel pool uses round robin for channel selection, so the first 2 x CPUs requests will see a latency spike. I would therefore recommend sending some fake read requests on application startup to warm up the channel pool.

    Also, if a channel is idle for a while, it will automatically disconnect, causing the next request to pay the initialization cost. So for low-QPS applications, I would recommend sizing the pool to a single connection.

    Channels are also required to reconnect every hour, so you might see periodic latency spikes (they are a bit more evident for low-QPS applications). To mitigate this, I would recommend enabling the auto channel refresh:

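    import com.google.cloud.bigtable.data.v2.BigtableDataSettings;

    BigtableDataSettings.Builder settingsBuilder = BigtableDataSettings.newBuilder()
        .setProjectId("...")
        .setInstanceId("...")
        // Enable the auto channel refresh (experimental).
        .setRefreshingChannel(true);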
    

    Please note that this feature is experimental and might change in the future.
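    The warm-up effect of round-robin selection mentioned above can be illustrated with a toy model. This is a simulation only: it does not use the Bigtable client, and the class `RoundRobinWarmupDemo` and its methods are hypothetical names. It treats a channel as "cold" until it has served one request.

```java
// Toy model of a round-robin channel pool; it does not use the Bigtable client.
final class RoundRobinWarmupDemo {
    private final boolean[] initialized; // true once a channel has paid its setup cost
    private int next = 0;                // index of the channel the next request will use

    RoundRobinWarmupDemo(int poolSize) {
        this.initialized = new boolean[poolSize];
    }

    // Sends one simulated request; returns true if it landed on a cold channel.
    boolean sendRequest() {
        int index = next;
        next = (next + 1) % initialized.length;
        boolean cold = !initialized[index];
        initialized[index] = true;
        return cold;
    }

    // Sends `requests` simulated requests and counts how many hit a cold channel.
    int countColdRequests(int requests) {
        int cold = 0;
        for (int i = 0; i < requests; i++) {
            if (sendRequest()) {
                cold++;
            }
        }
        return cold;
    }

    public static void main(String[] args) {
        int poolSize = 2 * 4; // e.g. the 2 x CPUs default on a 4-CPU machine
        RoundRobinWarmupDemo pool = new RoundRobinWarmupDemo(poolSize);
        System.out.println("Cold requests during warm-up: " + pool.countColdRequests(poolSize));
        System.out.println("Cold requests afterwards: " + pool.countColdRequests(100));
    }
}
```

    In this model, exactly one request per channel pays the setup cost, which is why sending at least as many warm-up requests as there are channels at startup moves that cost out of the user-facing path. The model ignores idle disconnects and the hourly reconnect.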

    There are 2 benefits to using multiple channels:

    1. Each gRPC channel can have at most 100 outstanding requests at a time. Having more connections allows for greater throughput.

    2. Each gRPC channel will be connected to a different Bigtable frontend, which spreads the load of accepting requests across more machines.
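    The first point lends itself to a rough back-of-the-envelope calculation. The sketch below takes the limit of 100 outstanding requests per channel from the list above and assumes that the number of in-flight requests is roughly QPS times average latency (Little's law). The class and method names are hypothetical, and the result is a lower bound, not a tuning recommendation.

```java
// Rough estimate only; names are illustrative and not part of the Bigtable client.
final class ChannelThroughputEstimate {
    // Per-channel limit quoted in the list above.
    static final long MAX_OUTSTANDING_PER_CHANNEL = 100;

    // Upper bound on concurrent requests for a pool of the given size.
    static long maxConcurrentRequests(int poolSize) {
        return poolSize * MAX_OUTSTANDING_PER_CHANNEL;
    }

    // Minimum channels needed so that (qps * latency) in-flight requests fit under the limit.
    // Uses integer ceiling division to avoid floating-point rounding surprises.
    static long channelsNeeded(long qps, long avgLatencyMillis) {
        long inFlightTimes1000 = qps * avgLatencyMillis; // in-flight requests, scaled by 1000
        long divisor = 1000 * MAX_OUTSTANDING_PER_CHANNEL;
        long channels = (inFlightTimes1000 + divisor - 1) / divisor;
        return Math.max(1, channels);
    }

    public static void main(String[] args) {
        System.out.println("Max concurrent requests with 8 channels: " + maxConcurrentRequests(8));
        System.out.println("Channels needed for 10000 QPS at 50 ms: " + channelsNeeded(10_000, 50));
    }
}
```

    For a low-QPS application the estimate comes out as a single channel, which matches the recommendation above to size the pool to one connection in that case.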

    A colleague of mine found that the default setting of 2 x CPUs for the channel pool size should fit high-QPS use cases fairly well.

    Finally, I think that adding a way to easily print the default settings is an excellent feature request and I will add a separate issue for it.