google-cloud-platform google-cloud-ml gcp-ai-platform-notebook

Custom code containers for inference on Google Cloud ML


I am aware that it is possible to deploy custom containers for training jobs on Google Cloud, and I have been able to get one running with the following command:

gcloud ai-platform jobs submit training infer name --region some_region --master-image-uri=path/to/docker/image --config config.yaml

The training job completed successfully and produced the model. Now I want to use this model for inference. The issue is that part of my code has system-level dependencies, so I have to make some modifications to the architecture in order to keep it running all the time. This was the reason for using a custom container for the training job in the first place.

As far as I know, documentation is only available for the training part. Inference with custom containers (if it is possible at all) does not seem to be covered.

The documentation for the training part is available at this link.

My question is: is it possible to deploy custom containers for inference on Google Cloud ML?


Solution

  • This response refers to using Vertex AI Prediction, the newest platform for ML on GCP.

    Suppose you wrote the model artifacts out to Cloud Storage from your training job.

    The next step is to build the custom container image and push it to a registry, following the requirements described here:

    https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements

    This section describes how the model artifact directory is passed to the custom container so it can be used for inference:

    https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#artifacts

    You will also need to create an endpoint in order to deploy the model:

    https://cloud.google.com/vertex-ai/docs/predictions/deploy-model-api#aiplatform_deploy_model_custom_trained_model_sample-gcloud

    Finally, you would use gcloud ai endpoints deploy-model ... to deploy the model to the endpoint:

    https://cloud.google.com/sdk/gcloud/reference/ai/endpoints/deploy-model
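
To make the container requirements linked above more concrete, here is a minimal sketch of a server that could run inside the custom container. It uses only the Python standard library and reads the `AIP_HTTP_PORT`, `AIP_HEALTH_ROUTE`, `AIP_PREDICT_ROUTE` and `AIP_STORAGE_URI` environment variables that Vertex AI sets in the serving container. The fallback values, the `load_model` helper and the placeholder model (it just sums each instance) are assumptions for illustration; a real container would download the artifacts from the `AIP_STORAGE_URI` location and load the actual model there.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set by Vertex AI in the serving container; the fallbacks are
# assumptions used only when running the sketch locally.
PORT = int(os.environ.get("AIP_HTTP_PORT", "8080"))
HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")
STORAGE_URI = os.environ.get("AIP_STORAGE_URI", "")  # gs:// directory with artifacts


def load_model(storage_uri):
    """Hypothetical loader: a real one would fetch and load the artifacts."""
    # Placeholder model (assumption): sums the numbers in one instance.
    return lambda instance: sum(instance)


MODEL = load_model(STORAGE_URI)


def predict_response(request):
    """Map a {"instances": [...]} request to a {"predictions": [...]} reply."""
    return {"predictions": [MODEL(x) for x in request["instances"]]}


class Handler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: answer 200 once the server is ready for traffic.
        if self.path == HEALTH_ROUTE:
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != PREDICT_ROUTE:
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            request = json.loads(self.rfile.read(length))
            self._send_json(200, predict_response(request))
        except (ValueError, KeyError, TypeError) as exc:
            self._send_json(400, {"error": str(exc)})


def make_server(port=PORT):
    # Listen on all interfaces so the platform can reach the container.
    return HTTPServer(("0.0.0.0", port), Handler)


def main():
    make_server().serve_forever()
```

The container's entrypoint would call `main()`. Because the system-level dependencies live in the image itself, the same base image used for training can usually be reused here.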
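
The deployment steps can be sketched as a sequence of three `gcloud` calls. The answer above does not spell out the first one, but as far as I know the model has to be registered with `gcloud ai models upload` (pointing at the container image and the artifact directory) before it can be deployed to an endpoint. The helper below only assembles the command lines; the function name, the default machine type and the `MODEL_ID` / `ENDPOINT_ID` placeholders are assumptions, and the real IDs come from the output of the earlier commands.

```python
import shlex


def build_deploy_commands(region, display_name, image_uri, artifact_uri,
                          model_id="MODEL_ID", endpoint_id="ENDPOINT_ID",
                          machine_type="n1-standard-2"):
    """Return the three gcloud invocations as argument lists (hypothetical helper)."""
    # 1. Register the model: custom serving image plus artifact directory.
    upload = [
        "gcloud", "ai", "models", "upload",
        f"--region={region}",
        f"--display-name={display_name}",
        f"--container-image-uri={image_uri}",
        f"--artifact-uri={artifact_uri}",
    ]
    # 2. Create an endpoint to host the model.
    create = [
        "gcloud", "ai", "endpoints", "create",
        f"--region={region}",
        f"--display-name={display_name}-endpoint",
    ]
    # 3. Deploy the registered model to that endpoint.
    deploy = [
        "gcloud", "ai", "endpoints", "deploy-model", endpoint_id,
        f"--region={region}",
        f"--model={model_id}",
        f"--display-name={display_name}",
        f"--machine-type={machine_type}",
    ]
    return [upload, create, deploy]


# Print the commands for review; every value here is a placeholder.
for cmd in build_deploy_commands(
    region="some_region",
    display_name="my-model",
    image_uri="path/to/docker/image",
    artifact_uri="gs://some-bucket/model-dir",
):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the placeholders are replaced with real values.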