google-cloud-vertex-ai kfp

Error when importing sklearn in pipeline component


When I run this simple pipeline (in GCP's Vertex AI Workbench), I get an error:

ModuleNotFoundError: No module named 'sklearn'

Here is my code:

from kfp.v2 import compiler
from kfp.v2.dsl import pipeline, component
from google.cloud import aiplatform

@component(
    packages_to_install=["sklearn"],
    base_image="python:3.9",
)
def test_sklearn():
    import sklearn

@pipeline(
    pipeline_root=PIPELINE_ROOT,
    name="sklearn-pipeline",
)
def pipeline():
    test_sklearn()

compiler.Compiler().compile(pipeline_func=pipeline, package_path="sklearn_pipeline.json")

job = aiplatform.PipelineJob(
    display_name=PIPELINE_DISPLAY_NAME,
    template_path="sklearn_pipeline.json",
    pipeline_root=PIPELINE_ROOT,
    location=REGION
)

job.run(service_account=SERVICE_ACCOUNT)

What am I doing wrong? :)


Solution

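  • The package name sklearn seems to have stopped working after a version upgrade: on PyPI, sklearn is only a deprecated alias, and the actual distribution is called scikit-learn. Change the value of packages_to_install from "sklearn" to "scikit-learn" in the @component block. The import sklearn statement inside the function stays the same, because the module name has not changed.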
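
The general pitfall here is that the name you import is not always the name you pip install. As a minimal sketch, assuming you want to catch this before compiling the pipeline, the snippet below maps a few well-known import names to their PyPI distribution names. The names IMPORT_TO_PIP, to_pip_names and component_kwargs are hypothetical helpers made up for this illustration; they are not part of kfp.

```python
# Hypothetical helper (not part of kfp): translate import names into the
# PyPI distribution names that pip expects.
IMPORT_TO_PIP = {
    "sklearn": "scikit-learn",
    "cv2": "opencv-python",
    "yaml": "PyYAML",
    "PIL": "Pillow",
}


def to_pip_names(import_names):
    """Return pip-installable names; unknown names are passed through unchanged."""
    return [IMPORT_TO_PIP.get(name, name) for name in import_names]


# Keyword arguments intended for the @component decorator from the question.
component_kwargs = {
    "packages_to_install": to_pip_names(["sklearn"]),
    "base_image": "python:3.9",
}

print(component_kwargs["packages_to_install"])  # ['scikit-learn']
```

In the pipeline from the question, these arguments would be applied as @component(**component_kwargs) above test_sklearn, which is equivalent to writing packages_to_install=["scikit-learn"] directly.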