tensorflow  google-cloud-ml  google-prediction  google-ai-platform

Create Version Failed. Bad model detected with error: "Error loading the model" - AI Platform Prediction


I created a model through the AI Platform UI that uses a global endpoint. I am trying to deploy a basic TensorFlow 1.15.0 model that I exported using the SavedModel builder. When I try to deploy this model, I get a Create Version Failed. Bad model detected with error: "Error loading the model" error in the UI, and I see the following in the logs:

ERROR:root:Failed to import GA GRPC module. This is OK if the runtime version is 1.x

Failure: Could not reach metadata service: Internal Server Error.

ERROR:root:Command '['/tools/google-cloud-sdk/bin/gsutil', '-o', 'GoogleCompute:service_account=default', 'cp', '-R', 'gs://cml-365057443918-1608667078774578/models/xsqr_global/v6/7349456410861999293/model/*', '/tmp/model/0001']' returned non-zero exit status 1.

ERROR:root:Error loading model: 'generator' object has no attribute 'next'

ERROR:root:Error loading the model

What is strange is that gcloud ai-platform local predict works correctly with this exported model, and I can deploy this exact same model on a regional endpoint with no issues. The error only appears when I try to use a model on the global endpoint. But I need the global endpoint because I plan on using a custom prediction routine (if I can get this basic model working first).

The logs seem to suggest an issue with copying the model from storage? I've tried giving various IAM roles additional viewer permissions, but I still get the same errors.

Thanks for the help.


Solution

  • I think it's the same issue as https://issuetracker.google.com/issues/175316320

    The comment in the issue says the fix is now rolling out.
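
As a side note on the log line "'generator' object has no attribute 'next'": this is the message Python 3 produces when code calls the Python 2 style .next() method on a generator. The sketch below only reproduces that message locally. It is not the service's code, and the numbers() generator is a hypothetical example, so treat it as an illustration of what the error text means rather than a diagnosis of the deployment failure.

```python
def numbers():
    """Hypothetical generator used only for illustration."""
    yield 1
    yield 2


gen = numbers()

# Python 2 style call: generators had a .next() method there.
# Under Python 3 the method is named __next__, so this raises AttributeError.
try:
    gen.next()
    message = None
except AttributeError as exc:
    message = str(exc)

# Portable form: the built-in next() works on any iterator.
first = next(numbers())

print(message)
print(first)
```

If the same message shows up in your own code (for example in a custom prediction routine), replacing obj.next() with next(obj) is the usual fix.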