I didn't find the answer on the documentation page: https://cloud.google.com/bigtable/docs/connection-pools
Hopefully someone here knows, thanks!
The default channel pool size is set to 2 x CPUs: https://github.com/googleapis/java-bigtable/blob/9a0f9838c9a96ae1108da36f79bf1a4cdf4b5749/google-cloud-bigtable/src/main/java/com/google/cloud/bigtable/data/v2/stub/EnhancedBigtableStubSettings.java#L243
This approach can sometimes backfire in Linux containers (and Kubernetes by extension), which can report the number of CPUs as 1.
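To see what this means for your own deployment, a minimal sketch like the following prints the CPU count the JVM reports and the pool size implied by the 2 x CPUs rule. The class name, method name, and constant are made up for illustration; only `Runtime.getRuntime().availableProcessors()` is a real JDK call, and I am assuming the client derives its default from that same value.

```java
final class DefaultPoolSizeCheck {
    // Mirrors the "2 x CPUs" rule described above; this constant is ours, not the library's.
    static final int CHANNELS_PER_CPU = 2;

    // Pool size implied by the 2 x CPUs rule for a given CPU count.
    static int defaultPoolSize(int cpus) {
        if (cpus < 1) {
            throw new IllegalArgumentException("cpus must be >= 1");
        }
        return CHANNELS_PER_CPU * cpus;
    }

    public static void main(String[] args) {
        // In a container this can be lower than the host's CPU count.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("CPUs visible to the JVM: " + cpus);
        System.out.println("Implied default channel pool size: " + defaultPoolSize(cpus));
    }
}
```

Running this inside the container shows whether the JVM sees a single CPU. If it does, the implied pool has only 2 channels, and setting the pool size explicitly is probably safer than relying on the default. Recent JVMs also accept `-XX:ActiveProcessorCount=N` to override the detected count, but check that your JVM version supports it.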
Initializing each channel is quite expensive. The channel pool uses round robin for channel selection, so the first 2 x CPUs requests will each see a latency spike. I would recommend sending some fake read requests on application startup to warm up the channel pool.

Also, if a channel is idle for a while, it will automatically disconnect, causing the next request to pay the initialization cost. So for low-QPS applications, I would recommend sizing the pool to a single connection.

In addition, channels are required to reconnect every hour, so you might see periodic latency spikes (they are a bit more evident for low-QPS applications). To mitigate this, I would recommend enabling auto channel refresh:
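import com.google.cloud.bigtable.data.v2.BigtableDataSettings;

BigtableDataSettings settings = BigtableDataSettings.newBuilder()
    .setProjectId("...")
    .setInstanceId("...")
    // Experimental auto channel refresh
    .setRefreshingChannel(true)
    .build();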
Please note that this feature is experimental and might change in the future.
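To make the warm-up argument concrete, here is a toy simulation of a lazily connecting pool with round-robin selection. It does not use the Bigtable client at all; `RoundRobinPoolSketch` and its methods are hypothetical names, and "cold" simply means the request landed on a channel that had not connected yet.

```java
final class RoundRobinPoolSketch {
    // connected[i] is true once simulated channel i has paid its setup cost.
    private final boolean[] connected;
    private int next = 0;

    RoundRobinPoolSketch(int poolSize) {
        if (poolSize < 1) {
            throw new IllegalArgumentException("poolSize must be >= 1");
        }
        connected = new boolean[poolSize];
    }

    // Picks the next channel round-robin; returns true if this request
    // had to pay the (simulated) connection setup cost.
    boolean sendRequest() {
        int idx = next;
        next = (next + 1) % connected.length;
        boolean cold = !connected[idx];
        connected[idx] = true;
        return cold;
    }

    // Sends one throwaway request per channel so every channel is connected.
    void warmUp() {
        for (int i = 0; i < connected.length; i++) {
            sendRequest();
        }
    }

    // Sends the given number of requests and counts how many were cold.
    static int countCold(RoundRobinPoolSketch pool, int requests) {
        int cold = 0;
        for (int i = 0; i < requests; i++) {
            if (pool.sendRequest()) {
                cold++;
            }
        }
        return cold;
    }

    public static void main(String[] args) {
        RoundRobinPoolSketch unwarmed = new RoundRobinPoolSketch(8);
        System.out.println("Cold requests without warm-up: " + countCold(unwarmed, 20)); // prints 8

        RoundRobinPoolSketch warmed = new RoundRobinPoolSketch(8);
        warmed.warmUp();
        System.out.println("Cold requests after warm-up: " + countCold(warmed, 20)); // prints 0
    }
}
```

With a pool of 8, exactly the first 8 requests are cold. Sending one throwaway request per channel at startup moves that cost off real traffic, which is the point of the warm-up. The simulation leaves out idle disconnects and the hourly reconnects.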
There are 2 benefits to using multiple channels:
1. Each gRPC channel can have at most 100 outstanding requests at a time. Having more connections allows for greater throughput.
2. Each gRPC channel will be connected to a different Bigtable frontend, which spreads the load of accepting requests across more machines.
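The 100-requests-per-channel limit gives a rough way to sanity-check a pool size. The sketch below is back-of-the-envelope arithmetic, not anything the client does. It assumes steady traffic and uses Little's law (requests in flight = QPS x average latency), and the class and method names are invented for the example.

```java
final class ChannelCountEstimate {
    // Per-channel limit quoted in the answer above.
    static final int MAX_OUTSTANDING_PER_CHANNEL = 100;

    // Upper bound on concurrent requests for a given number of channels.
    static long maxConcurrentRequests(int channels) {
        return (long) channels * MAX_OUTSTANDING_PER_CHANNEL;
    }

    // Rough lower bound on channels for a steady load, using Little's law:
    // in-flight requests = QPS * average latency (in seconds).
    static int minChannels(long qps, long avgLatencyMillis) {
        if (qps < 0 || avgLatencyMillis < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        double inFlight = qps * avgLatencyMillis / 1000.0;
        return Math.max(1, (int) Math.ceil(inFlight / MAX_OUTSTANDING_PER_CHANNEL));
    }

    public static void main(String[] args) {
        // 50,000 QPS at 10 ms average latency is about 500 requests in flight.
        System.out.println("Channels needed: " + minChannels(50_000, 10)); // prints 5
        System.out.println("Capacity of 5 channels: " + maxConcurrentRequests(5)); // prints 500
    }
}
```

This only gives a floor. Bursty traffic, long-running streaming reads, and the load-spreading benefit described above all argue for some headroom beyond it.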
A colleague of mine found that the default setting of 2 x CPUs for the channel pool size should fit high-QPS use cases fairly well.
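If the default does not fit (for example a low-QPS service, or a container that reports 1 CPU), the pool size can be set explicitly. The sketch below follows the pattern I believe the connection pools documentation uses. It needs the `google-cloud-bigtable` dependency on the classpath, the wrapper class and method names are hypothetical, and `setPoolSize` may be deprecated in favor of newer channel pool settings in recent client versions, so check it against the version you run.

```java
import com.google.cloud.bigtable.data.v2.BigtableDataSettings;
import com.google.cloud.bigtable.data.v2.stub.EnhancedBigtableStubSettings;

final class ExplicitPoolSize {
    // Builds client settings with a fixed channel pool size instead of the 2 x CPUs default.
    static BigtableDataSettings settingsWithPoolSize(
            String projectId, String instanceId, int poolSize) {
        BigtableDataSettings.Builder builder = BigtableDataSettings.newBuilder()
            .setProjectId(projectId)
            .setInstanceId(instanceId);

        // Replace the default transport provider with one that uses the requested pool size.
        builder.stubSettings().setTransportChannelProvider(
            EnhancedBigtableStubSettings.defaultGrpcTransportProviderBuilder()
                .setPoolSize(poolSize)
                .build());

        return builder.build();
    }
}
```

For the low-QPS case discussed earlier, this would be called with `poolSize` set to 1.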
Finally, I think that adding a way to easily print the default settings is an excellent feature request and I will add a separate issue for it.