Tags: grpc, http2, latency, grpc-java

How many streams will be used by a single Bi-directional streaming gRPC call?


We are noticing performance issues when using the bi-directional streaming API to stream data at 2000-4000 updates/second. We enabled debug logging and see that the streamId is the same for all outbound and inbound messages.

Question: Does a single bi-directional streaming RPC use only one stream to send data from either the client or the server? If so, does that mean the 2000-4000 updates will be streamed sequentially rather than concurrently?

What we tried:

We are using a single bi-directional streaming RPC call to publish updates at a rate of 2000-4000 updates/sec.

What we did not expect:

Latency issues (~50 ms) even when the client and server were running on the same host


Solution

  • Messages sent in a stream form a sequence and are ordered. From the gRPC Core Concepts:

    Bidirectional streaming RPCs where both sides send a sequence of messages using a read-write stream. ... The order of messages in each stream is preserved.

    Bidi streams are implemented using a single HTTP/2 stream, which is what keeps the entire call between a single client and a single server.

    Not only are bidi streams serial over the network; grpc-java also limits callbacks into the application to a single thread for that stream. This guarantees ordered delivery and means your callback does not need to be thread-safe.

    stream data at 2000-4000 updates/second

    The current gRPC benchmarks, specifically "Streaming secure throughput QPS (8 core client to 8 core server)", show gRPC Java able to stream at 600 kmsg/s, using TLS between separate hosts. However, those messages are small. The message size could influence the performance you see, and your own application processing will definitely impact the available performance.

    Latency issues (~50ms) even when the client and server was running on the same host

    The "Streaming secure ping pong median latency" benchmark shows around 125 µs of latency on an unloaded stream. 50 ms is high enough that it can realistically only be caused by an overloaded stream (especially if you aren't observing flow control via ServerCallStreamObserver and ClientCallStreamObserver.isReady(), which lets a long queue build up) or by application processing.
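The flow-control pattern mentioned above looks roughly like this. As a sketch, the `Stream` interface below is a minimal stand-in for the relevant slice of grpc-java's `ClientCallStreamObserver`/`ServerCallStreamObserver` so the example is self-contained; with the real library you would obtain the stream from `ClientResponseObserver.beforeStart()` (or by casting the server-side `StreamObserver`) rather than defining your own interface. The key idea is to write only while `isReady()` is true and let the on-ready handler resume the drain, instead of calling `onNext()` unconditionally and letting a queue build up:

```java
import java.util.Iterator;

public class FlowControlSketch {
  // Stand-in for the isReady()/onReadyHandler slice of
  // io.grpc.stub.ClientCallStreamObserver / ServerCallStreamObserver.
  interface Stream<T> {
    boolean isReady();                   // can the transport accept a message without buffering?
    void onNext(T message);              // send one message
    void onCompleted();                  // half-close the stream
    void setOnReadyHandler(Runnable r);  // invoked whenever isReady() flips back to true
  }

  // Publish all updates, respecting transport readiness. grpc-java invokes
  // the on-ready handler each time the stream becomes writable again, so
  // registering drain() as the handler resumes sending automatically.
  static <T> void publish(Stream<T> stream, Iterator<T> updates) {
    stream.setOnReadyHandler(() -> drain(stream, updates));
    drain(stream, updates);
  }

  // Write as many messages as the transport will take right now, then stop.
  private static <T> void drain(Stream<T> stream, Iterator<T> updates) {
    while (stream.isReady() && updates.hasNext()) {
      stream.onNext(updates.next());
    }
    if (!updates.hasNext()) {
      stream.onCompleted();
    }
  }
}
```

Without this, every `onNext()` past the transport's capacity just queues in memory, which is exactly the kind of backlog that turns microsecond latencies into tens of milliseconds at 2000-4000 updates/sec.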