timeout tensorflow grpc serving

TensorFlow Serving server closes the connection before the client timeout elapses


We use TensorFlow Serving to load the model and have implemented a Java gRPC client.

Normally it works for small data. But if we send a request with a larger batch size, where the data is around 1~2M, the server closes the connection and throws an internal error quickly, well before the client timeout.

We have also opened an issue to track this at https://github.com/tensorflow/serving/issues/284.

Job aborted due to stage failure: Task 47 in stage 7.0 failed 4 times, most recent failure: Lost task 47.3 in stage 7.0 (TID 5349, xxx)
io.grpc.StatusRuntimeException: INTERNAL: HTTP/2 error code: INTERNAL_ERROR
Received Rst Stream
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:230)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:211)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:144)
at tensorflow.serving.PredictionServiceGrpc$PredictionServiceBlockingStub.predict(PredictionServiceGrpc.java:160)

......

at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:189)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:91)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:219)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:

Solution

  • As can be seen in the issue linked above, this was caused by the message exceeding the default maximum message size of 4 MiB. Either the receiver needs to explicitly permit larger messages, or the sender needs to send smaller ones.

    gRPC itself is fine with larger messages (even hundreds of MB), but applications frequently aren't. The max message size is in place so that "large" messages are accepted only by applications that are prepared to handle them.
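The "send smaller messages" option can be sketched on the client side with plain Java. This is only an illustration, not code from the issue: `BatchSplitter`, `split`, and the byte budget are hypothetical names, each `byte[]` stands in for one serialized input, and protobuf framing overhead is not counted, so the budget should sit somewhat below the real limit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups payloads into sub-batches that stay under a byte budget. */
final class BatchSplitter {
    private BatchSplitter() {}

    /**
     * Greedily packs items, in their original order, into sub-batches whose
     * summed payload size does not exceed maxBytes.
     */
    static List<List<byte[]>> split(List<byte[]> items, int maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] item : items) {
            // A single item larger than the budget can never fit in any sub-batch.
            if (item.length > maxBytes) {
                throw new IllegalArgumentException(
                        "item of " + item.length + " bytes exceeds budget of " + maxBytes);
            }
            // Start a new sub-batch when adding this item would overflow the current one.
            if (!current.isEmpty() && currentBytes + item.length > maxBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(item);
            currentBytes += item.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each returned sub-batch would then be turned into its own predict request, for example with a budget such as `3 * 1024 * 1024` to leave headroom under the 4 MiB default. Raising the limit on the receiving side is the other option mentioned above and is not shown here.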