
tensorflow/serving with top n logits to return


I'm currently working on serving my TensorFlow models in a scalable way. As far as I know, the recommended solution is to use the standard TensorFlow ModelServer. It handles common requirements pretty well, but I want more: I want to reduce the amount of data transferred by passing a parameter such as "limit" that defines how many of the top n logits and probabilities to return.

During my research I identified the following solutions:

1) Create a more advanced SignatureDef during model building.

2) Customize the basic tensorflow/serving project with the mentioned functionality.

3) Serve the model with the standard TensorFlow ModelServer and build a postprocessing service that restructures or filters the result in the predefined way.

Can someone more experienced than me go into some detail on these options? Code snippets or links would be awesome.

Thanks in advance.


Solution

  • Your solution number 3,

    "Serve the model with the standard TensorFlow ModelServer and build a postprocessing service that restructures or filters the result in the predefined way,"

    should be the best one.

    Links and code snippets: taking the MNIST example from TF Serving, the script that exports the SavedModel is https://github.com/tensorflow/serving/blob/87e32bb386f156fe208df633c1a7f489b57464e1/tensorflow_serving/example/mnist_saved_model.py,

    and the client code is https://github.com/tensorflow/serving/blob/87e32bb386f156fe208df633c1a7f489b57464e1/tensorflow_serving/example/mnist_client.py.

    If we want the values of the top-n predictions, we can tweak the function _create_rpc_callback in the client file as shown below.

    def _create_rpc_callback(label, result_counter):
      """Creates RPC callback function.
    
      Args:
        label: The correct label for the predicted example.
        result_counter: Counter for the prediction result.
      Returns:
        The callback function.
      """
      def _callback(result_future):
        """Callback function.
    
        Calculates the statistics for the prediction result.
    
        Args:
          result_future: Result future of the RPC.
        """
        exception = result_future.exception()
        if exception:
          result_counter.inc_error()
          print(exception)
        else:
          sys.stdout.write('.')
          sys.stdout.flush()
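          scores = numpy.array(result_future.result().outputs['scores'].float_val)
          # Sort class indices by score, highest first, and keep the best 4.
          top_indices = numpy.argsort(scores)[::-1][:4]
          print('Top 4 classes =', top_indices, 'scores =', scores[top_indices])
        # Bookkeeping and return, unchanged from the original client.
        result_counter.inc_done()
        result_counter.dec_active()
      return _callback

    The print statement prints the top-4 predictions: the four highest scores together with their class indices. Simply slicing the scores array with [0:4] would only give the scores of classes 0 to 3, not the highest ones.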
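To actually reduce the amount of data sent to the end client, the same filtering can live in a small postprocessing service that sits between the ModelServer and the client and accepts the "limit" parameter from the question. Below is a minimal sketch of the filtering step only, in plain Python so it does not depend on numpy. The function name top_n_scores and its parameters are hypothetical and not part of TensorFlow Serving; it assumes the service has already read the scores for one example into a flat list, for instance from outputs['scores'].float_val.

```python
import heapq
from typing import List, Sequence, Tuple


def top_n_scores(scores: Sequence[float], limit: int) -> List[Tuple[int, float]]:
    """Return the `limit` highest (class_index, score) pairs, best first.

    Hypothetical helper for a postprocessing service; `scores` is assumed
    to hold one score per class for a single example.
    """
    if limit <= 0:
        return []
    # enumerate() pairs each score with its class index;
    # nlargest() then picks the pairs with the highest scores.
    return heapq.nlargest(limit, enumerate(scores), key=lambda pair: pair[1])


if __name__ == "__main__":
    example = [0.01, 0.70, 0.05, 0.20, 0.04]
    print(top_n_scores(example, 2))  # [(1, 0.7), (3, 0.2)]
```

The service would then return only these pairs to the caller instead of the full score vector. If limit is larger than the number of classes, all classes are returned in descending order of score.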