tensorflow, grpc, amazon-sagemaker, tensorflow-serving

Does AWS SageMaker support gRPC prediction requests?


I deployed a SageMaker TensorFlow model from an estimator in local mode, and when I try to call the TensorFlow Serving (TFS) predict endpoint using gRPC I get the error:

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"

I'm making the gRPC request exactly as in this blog post:

import grpc
import numpy as np
from tensorflow.compat.v1 import make_tensor_proto
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

grpc_port = 9000  # also tried other ports, such as 8500
MAX_GRPC_MESSAGE_LENGTH = 512 * 1024 * 1024  # e.g. 512 MB
request = predict_pb2.PredictRequest()
request.model_spec.name = 'model'

request.model_spec.signature_name = 'serving_default'
request.inputs['input_tensor'].CopyFrom(make_tensor_proto(instance))
options = [
    ('grpc.enable_http_proxy', 0),
    ('grpc.max_send_message_length', MAX_GRPC_MESSAGE_LENGTH),
    ('grpc.max_receive_message_length', MAX_GRPC_MESSAGE_LENGTH)
]

channel = grpc.insecure_channel(f'0.0.0.0:{grpc_port}', options=options)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

result_future = stub.Predict.future(request, 30)  

output_tensor_proto = result_future.result().outputs['predictions']
output_shape = [dim.size for dim in output_tensor_proto.tensor_shape.dim]

output_np = np.array(output_tensor_proto.float_val).reshape(output_shape)

prediction_json = {'predictions': output_np.tolist()}
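As a sanity check for the "failed to connect to all addresses" error, here is a minimal stdlib-only probe (the `port_is_open` helper name is just for this sketch) that confirms whether anything is actually listening on the target port:

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# If this prints False, nothing on this host accepts connections on the TFS
# gRPC port, which would explain grpc's StatusCode.UNAVAILABLE.
print(port_is_open('localhost', 9000))
```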

Looking at the SageMaker docker container where TFS is running, I can see in the logs that the REST endpoint is exported/exposed, but not the gRPC one, even though gRPC appears to be running:

tensorflow_serving/model_servers/server.cc:417] Running gRPC ModelServer at 0.0.0.0:9000 ...

Unlike the gRPC one, the REST endpoint is reported as exported in the container logs:

tensorflow_serving/model_servers/server.cc:438] Exporting HTTP/REST API at:localhost:8501 ...

Do SageMaker TFS containers even support gRPC? How can one make a gRPC TFS prediction request using SageMaker?


Solution

  • SageMaker endpoints are REST endpoints. You can, however, make gRPC connections within the container; you cannot make the InvokeEndpoint API call itself via gRPC.

    If you are using the SageMaker TensorFlow container, you need to pass an inference.py script that contains the logic to make the gRPC request to TFS.

    Kindly see this example inference.py script that makes a gRPC prediction against TensorFlow Serving.
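    For reference, a minimal sketch of what such an inference.py could look like. Assumptions, not verbatim from any container: the local gRPC port is taken from a `grpc_port` attribute on the request context with a fallback (the attribute name varies across container versions, so verify it), the model is named 'model' with a 'serving_default' signature, and the request body is JSON with an 'instances' list.

    ```python
    import json

    def _reshape(flat, shape):
        """Nest a flat list of floats into `shape` (dependency-free
        stand-in for np.reshape, to keep this sketch self-contained)."""
        if len(shape) <= 1:
            return list(flat)
        step = len(flat) // shape[0]
        return [_reshape(flat[i * step:(i + 1) * step], shape[1:])
                for i in range(shape[0])]

    def handler(data, context):
        """SageMaker TFS container entry point: REST in, gRPC to local TFS, REST out."""
        # Imported lazily so the pure-Python parts of this sketch run without TF;
        # inside the container these imports could live at module level.
        import grpc
        from tensorflow.compat.v1 import make_tensor_proto
        from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

        instances = json.loads(data.read().decode('utf-8'))['instances']

        request = predict_pb2.PredictRequest()
        request.model_spec.name = 'model'
        request.model_spec.signature_name = 'serving_default'
        request.inputs['input_tensor'].CopyFrom(make_tensor_proto(instances))

        # The gRPC hop stays inside the container: talk to the local TFS process.
        grpc_port = getattr(context, 'grpc_port', 9000)  # attribute name assumed
        channel = grpc.insecure_channel(f'localhost:{grpc_port}')
        stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
        result = stub.Predict(request, 30)  # 30-second timeout

        output = result.outputs['predictions']
        shape = [dim.size for dim in output.tensor_shape.dim]
        predictions = _reshape(list(output.float_val), shape)
        return json.dumps({'predictions': predictions}), 'application/json'
    ```

    The key point is that `grpc.insecure_channel` targets localhost inside the container; the client outside still calls the endpoint through the REST InvokeEndpoint API.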