python · machine-learning · google-cloud-vertex-ai · kubeflow-pipelines · vertex-ai-pipeline

Import a custom prediction routine into a Vertex AI pipeline


I have created a custom prediction routine on Vertex AI, uploaded the model, and am able to generate predictions with it through the UI. Now, I would like to incorporate this into a Vertex AI Pipeline, to run batch predictions after a data generation step. I am using the Kubeflow Pipelines SDK.

To do so, I am thinking of using the ModelBatchPredictOp prebuilt component. For this to work, I need to import the model into the pipeline, for example with an importer component. However, the importer component requires an artifact URI, which my model does not have because it uses a custom container: the model is baked into the container image, not sitting in GCS.

So, though I did not really expect it to work, I tried writing a quick custom importer component that returns the model object, but I get a type mismatch. See the sample code below:

from helper import data_component
from kfp.dsl import component, pipeline, Output, Model
from google_cloud_pipeline_components.v1.batch_predict_job import ModelBatchPredictOp


@component(packages_to_install=["google-cloud-aiplatform"])
def custom_importer(model: Output[Model]):
    from google.cloud import aiplatform
    return aiplatform.Model(model_name="model-id")


@pipeline(name="prediction-pipeline")
def pipeline():
    data_task = data_component()
    
    importer_task = custom_importer()

    batch_predict_op = ModelBatchPredictOp(
        job_display_name="batch_predict_job",
        model=importer_task.output,
        gcs_source_uris=data_task.outputs["dataset"],
        gcs_destination_output_uri_prefix="bucket",
        instances_format="csv",
        predictions_format="jsonl",
        starting_replica_count=1,
        max_replica_count=1,
    )

The ModelBatchPredictOp doesn't like the input type for the model argument:

    InconsistentTypeException: Incompatible argument passed to the input 'model' of component 'model-batch-predict': Argument type 'system.Model@0.0.1' is incompatible with the input type 'google.VertexModel@0.0.1'

How can I incorporate batch prediction from a custom prediction routine into a Vertex AI Pipeline?


Solution

  • If your model is in the Model Registry, I think you should use the ModelGetOp component.

    v1.model.ModelGetOp(
        model_name: str,
        project: str = '{{$.pipeline_google_cloud_project_id}}',
        location: str = 'us-central1')
    

    You will need to enter the model resource name in the model_name argument. You can get it programmatically like this:

    from google.cloud import aiplatform

    model_name = 'my_model'
    model_resource_name = aiplatform.Model.list(
        filter=f'display_name="{model_name}"'
    )[0].resource_name
    

    You can use GetVertexModelOp instead if you're on an earlier version of google-cloud-pipeline-components/kfp.