Previously, using Kubeflow Pipelines SDK v1, the status of a pipeline could be inferred during pipeline execution by passing the Argo placeholder {{workflow.status}} to a component, as shown below:
import kfp.dsl as dsl

component_1 = dsl.ContainerOp(
    name='An example component',
    image='eu.gcr.io/.../my-component-img',
    arguments=[
        'python3', 'main.py',
        '--status', '{{workflow.status}}'
    ]
)
This placeholder takes the value Succeeded or Failed when passed to the component. One use case is to send a failure warning to e.g. Slack, in combination with dsl.ExitHandler.
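For reference, a minimal v1 sketch of that pattern (the notifier image, script, and step names are illustrative):

import kfp.dsl as dsl

@dsl.pipeline(name='example-pipeline')
def pipeline():
    # The exit op runs last and receives the final workflow status
    # via the Argo placeholder
    exit_op = dsl.ContainerOp(
        name='slack-notifier',
        image='eu.gcr.io/.../my-notifier-img',  # illustrative image
        arguments=['python3', 'notify.py', '--status', '{{workflow.status}}']
    )
    # Everything inside the ExitHandler block runs first; exit_op then
    # fires regardless of success or failure
    with dsl.ExitHandler(exit_op):
        dsl.ContainerOp(
            name='main-step',
            image='eu.gcr.io/.../my-component-img',
            arguments=['python3', 'main.py']
        )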
However, when using Pipelines SDK v2, kfp.v2, together with Vertex AI to compile and run the pipeline, the Argo placeholders no longer work, as described in this open issue. Because of this, I need another way to check the status of the pipeline within the component. I was thinking I could use the kfp.Client class, but I'm assuming this won't work on Vertex AI, since there is no real "host". Also, there seem to be supported placeholders for passing the run ID (dsl.PIPELINE_JOB_ID_PLACEHOLDER), as per this SO post, but I can't find anything for status.
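For reference, a minimal sketch of how I pass the run-ID placeholder in kfp.v2 (the component itself is illustrative):

from kfp.v2 import dsl

@dsl.component
def log_run_id(run_id: str):
    # The placeholder is resolved to the actual job ID at runtime
    print(f'Pipeline run ID: {run_id}')

@dsl.pipeline(name='example-pipeline')
def pipeline():
    log_run_id(run_id=dsl.PIPELINE_JOB_ID_PLACEHOLDER)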
Any ideas how to get the status of a pipeline run within a component, running on Vertex AI?
Each pipeline run is automatically logged to Google Cloud Logging, including failed pipeline runs. The error logs also contain information about the pipeline and the component that failed.
We can use this information to monitor our logs and set up an alert, via email for example.
We get the logs for our Vertex AI pipeline runs with the following filter:
resource.type="aiplatform.googleapis.com/PipelineJob" severity=(ERROR OR CRITICAL OR ALERT OR EMERGENCY)
Based on those logs you can set up log-based alerts (https://cloud.google.com/logging/docs/alerting/log-based-alerts). Notifications via email, Slack, SMS, and many more are possible.
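As a quick way to check that the filter matches your failed runs, here is a minimal sketch using the google-cloud-logging client library (the project ID is a placeholder):

from google.cloud import logging

client = logging.Client(project='my-project')  # placeholder project ID
log_filter = (
    'resource.type="aiplatform.googleapis.com/PipelineJob" '
    'severity=(ERROR OR CRITICAL OR ALERT OR EMERGENCY)'
)

# List the log entries for failed pipeline runs; each entry carries
# details about the pipeline and the component that failed
for entry in client.list_entries(filter_=log_filter):
    print(entry.timestamp, entry.severity, entry.payload)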
source: https://medium.com/google-cloud/google-vertex-ai-the-easiest-way-to-run-ml-pipelines-3a41c5ed153