tensorflow, gcloud, tensorflow-serving, google-cloud-ml-engine

TensorFlow Serving prediction not working with object detection pets example


I was trying to run predictions on Google Cloud ML Engine with the TensorFlow object detection pets example, but it doesn't work.

I created a checkpoint using this example: https://github.com/tensorflow/models/blob/master/object_detection/g3doc/running_pets.md

With the help of the TensorFlow team, I was able to create a saved_model to upload to Cloud ML Engine: https://github.com/tensorflow/models/issues/1811

Now I can upload the model to Cloud ML Engine. But unfortunately, I'm not able to make correct prediction requests to the model. Every time I try a prediction, I get the same error:

Input instances are not in JSON format.

I was trying to do online predictions with

gcloud ml-engine predict --model od_test --version v1 --json-instances prediction_test.json

and I was trying to do batch predictions with

gcloud ml-engine jobs submit prediction "prediction7" \
    --model od_test \
    --version v1 \
    --data-format TEXT \
    --input-paths gs://ml_engine_test1/prediction_test.json \
    --output-path gs://ml_engine_test1/prediction_output \
    --region europe-west1

I want to submit a list of images as uint8 matrices, so for the export I used the input type image_tensor.
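
To double-check what the exported model actually expects, the SavedModel signature can be inspected with TensorFlow's saved_model_cli tool (assuming the export was written to ./saved_model):

saved_model_cli show --dir saved_model --tag_set serve --signature_def serving_default

This prints the name, dtype, and shape of each input and output tensor in the serving signature.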

As stated in the documentation here: https://cloud.google.com/ml-engine/docs/concepts/prediction-overview#prediction_input_data, the input JSON should have a particular format. But neither the format for online predictions nor the one for batch predictions works. My latest tests were a single file with the content:

{"instances": [{"values": [1, 2, 3, 4], "key": 1}]}

and the content:

{"images": [0.0, 0.3, 0.1], "key": 3}
{"images": [0.0, 0.7, 0.1], "key": 2}

Neither of them worked. Can anyone tell me what the input format should be?

Edit

The error from the batch prediction job is:

{
    insertId:  "1a26yhdg2wpxvg6"   
    jsonPayload: {
        @type:  "type.googleapis.com/google.cloud.ml.api.v1beta1.PredictionLogEntry"    
        error_detail: {
            detail:  "No JSON object could be decoded"     
            input_snippet:  "Input snippet is unavailable."     
        }
        message:  "No JSON object could be decoded"    
    }
    logName:  "projects/tensorflow-test-1-168615/logs/worker"   
    payload: {
        @type:  "type.googleapis.com/google.cloud.ml.api.v1beta1.PredictionLogEntry"    
        error_detail: {
            detail:  "No JSON object could be decoded"     
            input_snippet:  "Input snippet is unavailable."     
        }
        message:  "No JSON object could be decoded"    
    }
    receiveTimestamp:  "2017-07-28T12:31:23.377623911Z"   
    resource: {
        labels: {
            job_id:  "prediction10"     
            project_id:  "tensorflow-test-1-168615"     
            task_name:  ""     
        }
        type:  "ml_job"    
    }
    severity:  "ERROR"   
    timestamp:  "2017-07-28T12:31:23.377623911Z"   
}

Solution

  • The model you exported accepts input in the following format for prediction if you use gcloud to submit your requests; this applies both to gcloud ml-engine local predict and to batch prediction:

    {"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}
    {"inputs": [[[232, 242, 219], [242, 240, 239], [242, 240, 239], [242, 242, 239], [242, 240, 123]]]}
    ...
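
    A real image will of course produce far larger arrays than the truncated rows above. Here is a minimal sketch for generating such a newline-delimited JSON file from an image, assuming numpy and Pillow are installed (the filenames are placeholders):

        import json

        import numpy as np
        from PIL import Image

        # Load the image as a uint8 RGB array of shape (height, width, 3),
        # matching the image_tensor input type used at export time.
        pixels = np.array(Image.open("pet.jpg").convert("RGB"), dtype=np.uint8)

        # One JSON object per line, keyed by the input tensor name "inputs".
        with open("prediction_test.json", "w") as f:
            f.write(json.dumps({"inputs": pixels.tolist()}) + "\n")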
    

    If you're sending the requests directly to the service (i.e., not using gcloud), the body of the request would look like:

    {"instances": [{"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}]}
    {"instances": [{"inputs": [[[232, 242, 219], [242, 240, 239], [242, 240, 239], [242, 242, 239], [242, 240, 123]]]}]}
    

    The input tensor name should be "inputs" because that is what we've specified in the signature's inputs. The value of each JSON object is a 3-D array, as the examples above show. The outer dimension is None to support batched input. No "instances" wrapper is needed (unless you use the HTTP API directly). Note that you cannot specify a "key" in the input unless you modify the graph to include an extra placeholder and output it untouched using tf.identity; a minimal sketch of that follows.
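
    This sketch assumes TensorFlow 1.x and that you re-export the model yourself (the tensor names are illustrative):

        import tensorflow as tf

        # Extra placeholder that carries the instance key; it plays no part
        # in inference.
        keys_placeholder = tf.placeholder(tf.string, shape=[None], name="key")

        # Pass the key through unchanged so it is returned alongside the
        # prediction; include it in both signature.inputs and signature.outputs.
        keys_output = tf.identity(keys_placeholder, name="key_out")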

    Also, as mentioned in the GitHub issue, the online service may not work due to the large amount of memory the model requires. We are working on that.