Tags: python, continuous-integration, github-actions, google-cloud-vertex-ai, kfp

How can I push a KFP pipeline to the Vertex AI registry with GitHub Actions?


This is the code I'm running:

pipeline_func = time_series_pipeline
pipeline_filename = 'time_series_pipeline.yaml'

print("compiling pipeline...")
compiler.Compiler().compile(
    pipeline_func=pipeline_func, package_path=pipeline_filename
)

parser = argparse.ArgumentParser(description='GCP access token.')
parser.add_argument('access_token', type=str, help='GCP access token')
args = parser.parse_args()
access_token = args.access_token

creds, project = google.auth.default()
auth_req = google.auth.transport.requests.Request()
creds.refresh(auth_req)

impersenated_creds = google.auth.impersonated_credentials.Credentials(
    source_credentials=environ['GOOGLE_APPLICATION_CREDENTIALS'],
    target_principal='gh-action@[project].iam.gserviceaccount.com'
)

registry = RegistryClient(
    host='https://europe-west3-kfp.pkg.dev/project_id/vertex-pipeline-registry',
    # auth=impersenated_creds
)

print("uploading pipeline...")
templateName, versionName = registry.upload_pipeline(
  file_name=pipeline_filename,
  tags=["v1", "latest"],
  extra_headers={"description":"This is an example pipeline template."})

Below is the error GitHub Actions returns:

Traceback (most recent call last):
compiling pipeline...
  File "/home/runner/work/project/project/vertex-pipeline/pipeline.py", line 112, in <module>
    creds.refresh(auth_req)
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/google/auth/external_account.py", line 401, in refresh
    self._impersonated_credentials.refresh(request)
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/google/auth/impersonated_credentials.py", line 235, in refresh
    self._update_token(request)
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/google/auth/impersonated_credentials.py", line 267, in _update_token
    self.token, self.expiry = _make_iam_token_request(
                              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/google/auth/impersonated_credentials.py", line 83, in _make_iam_token_request
    raise exceptions.RefreshError(_REFRESH_ERROR, response_body)
google.auth.exceptions.RefreshError: ('Unable to acquire impersonated credentials', '{\n  "error": {\n    "code": 400,\n    "message": "Request contains an invalid argument.",\n    "status": "INVALID_ARGUMENT"\n  }\n}\n')
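
The second element of this RefreshError is the raw JSON body of the failed token request, as the `raise exceptions.RefreshError(_REFRESH_ERROR, response_body)` frame shows. A small sketch for pulling the status out of such an error when logging it; `summarize_refresh_error` is a hypothetical helper, and a plain `Exception` stands in for `google.auth.exceptions.RefreshError` so the snippet runs without google-auth installed:

```python
import json


def summarize_refresh_error(exc):
    """Extract code, status and message from an exception whose second
    argument is a JSON error body. Returns None if it cannot be parsed."""
    if len(exc.args) < 2:
        return None
    try:
        body = json.loads(exc.args[1])
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return None
    error = body.get("error", {}) if isinstance(body, dict) else {}
    return {
        "code": error.get("code"),
        "status": error.get("status"),
        "message": error.get("message"),
    }


# Stand-in for the RefreshError shown in the traceback above.
example = Exception(
    "Unable to acquire impersonated credentials",
    '{\n  "error": {\n    "code": 400,\n    "message": "Request contains '
    'an invalid argument.",\n    "status": "INVALID_ARGUMENT"\n  }\n}\n',
)
print(summarize_refresh_error(example))
```

In the real script this would be called from an `except google.auth.exceptions.RefreshError as exc:` block around `creds.refresh(auth_req)`.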

The gh-action service account has the following roles:

AI Platform Admin
Artifact Registry Administrator
BigQuery Admin
BigQuery Data Editor
Cloud Functions Developer
Cloud Functions Invoker
Cloud Run Invoker
Editor
Eventarc Event Receiver
Pub/Sub Admin
Secret Manager Secret Accessor
Service Account Token Creator
Storage Admin
Storage Object User
Storage Transfer Admin
Storage Transfer agent
Storage Transfer Service service agent
Vertex AI administrator

I have tried passing the service account token to the auth parameter of RegistryClient(), passing the credentials file to the auth_file parameter, and impersonating a different account, but the same error comes up each time. Passing a token wrapped in ApiAuth() to the auth parameter returns a different, permission-related error, shown below. I'm not sure whether a specific permission is missing from the roles listed above:

  File "/home/runner/work/project/project/vertex-pipeline/pipeline.py", line 114, in <module>
    templateName, versionName = registry.upload_pipeline(
                                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/kfp/registry/registry_client.py", line 352, in upload_pipeline
    response.raise_for_status()
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://europe-west3-kfp.pkg.dev/***/vertex-pipeline-registry

The GitHub Actions workflow looks like this:

env:
  PROJECT_ID: ${{ secrets.PROJECT_ID }}
  TF_BACKEND: ${{ secrets.TF_BACKEND }}
  TF_VAR_gh_token: ${{ secrets.TF_VAR_gh_token }}
  TF_IN_AUTOMATION: "true"

jobs:
  terraform-apply:
    runs-on: 'ubuntu-latest'
    permissions:
      contents: 'read'
      id-token: 'write'
      issues: 'write'
      pull-requests: 'write'

    steps:
      - name: Checkout repository
        uses: 'actions/checkout@v3'

      - id: 'auth'
        uses: 'google-github-actions/auth@v2'
        with:
          token_format: 'access_token'
          workload_identity_provider: ${{ secrets.WIF_PROVIDER_NAME }}
          service_account: ${{ secrets.SERVICE_ACCOUNT_EMAIL }}
      
      - uses: 'google-github-actions/setup-gcloud@v2'
        with:
          install_components: "beta,terraform-tools,gsutil,core"

      - uses: 'actions/setup-python@v5'
        if: steps.changes.outputs.vertex == 'true'
        with:
          python-version: 3.12

      - name: Install python dependencies
        if: steps.changes.outputs.vertex == 'true'
        run: |
          echo "Changed files in vertex-pipeline"
          python -m pip install --upgrade pip
          pip install -r vertex-pipeline/requirements.txt

      - name: Run vertex pipeline
        if: steps.changes.outputs.vertex == 'true'
        run: python vertex-pipeline/pipeline.py ${{steps.auth.outputs.access_token}}
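
The last step passes the access token as a positional command-line argument. A minimal sketch of an alternative, assuming the workflow step exposes the token through an environment variable instead. The variable name `GCP_ACCESS_TOKEN` and the helper `resolve_access_token` are hypothetical and not part of the original script:

```python
import argparse
import os


def resolve_access_token(argv=None, env=None):
    """Return the token from the command line if given, otherwise from
    the GCP_ACCESS_TOKEN environment variable (hypothetical name)."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="GCP access token.")
    # nargs="?" makes the positional argument optional.
    parser.add_argument("access_token", nargs="?", default=None,
                        help="GCP access token (optional)")
    args = parser.parse_args(argv)
    token = args.access_token or env.get("GCP_ACCESS_TOKEN")
    if not token:
        raise ValueError("no access token on the command line or in "
                         "GCP_ACCESS_TOKEN")
    return token


# Example: no argument given, token taken from the environment mapping.
print(resolve_access_token([], {"GCP_ACCESS_TOKEN": "example-token"}))
```

With this in place, the step would set the variable under its own `env:` key from `steps.auth.outputs.access_token` and call the script without arguments.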

Edit: I tried using id_token and auth_token for authentication with ApiAuth(), but neither worked. I still got "Unable to acquire impersonated credentials".


Solution

  • I managed to get the authentication working. I granted the permissions to a service account different from the one authenticated in GitHub Actions, and impersonated it in Python like so:

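    import google.auth
    import google.auth.transport.requests
    from google.auth import impersonated_credentials
    from kfp.registry import ApiAuth, RegistryClient

    # pipeline_service_account and config.KFP_TEMPLATE_REGISTRY are defined
    # elsewhere in the script.

    # Credentials of the identity authenticated in GitHub Actions.
    creds, pid = google.auth.default()

    # Short-lived credentials for the separate pipeline service account.
    impersonated_creds = impersonated_credentials.Credentials(
        source_credentials=creds,
        target_principal=pipeline_service_account,
        delegates=[],
        target_scopes=['https://www.googleapis.com/auth/cloud-platform'],
        lifetime=300,  # seconds
    )

    # Fetch the access token before handing it to the registry client.
    request = google.auth.transport.requests.Request()
    impersonated_creds.refresh(request)

    registry = RegistryClient(
        host=config.KFP_TEMPLATE_REGISTRY,
        auth=ApiAuth(impersonated_creds.token)
    )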
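
The `host` passed to `RegistryClient` follows the pattern used in the question, `https://<region>-kfp.pkg.dev/<project>/<repository>`. A minimal sketch of assembling such a value from its parts, for example to populate a setting like `config.KFP_TEMPLATE_REGISTRY`. The helper name `registry_host` and its light validation are assumptions for illustration, not part of KFP:

```python
def registry_host(region, project, repository):
    """Build a KFP template registry URL from its three parts.

    Only checks that each part is non-empty and free of slashes and
    whitespace. It does not verify that the repository exists.
    """
    parts = {"region": region, "project": project, "repository": repository}
    for name, value in parts.items():
        if not value or "/" in value or any(ch.isspace() for ch in value):
            raise ValueError(f"invalid {name}: {value!r}")
    return f"https://{region}-kfp.pkg.dev/{project}/{repository}"


# Values taken from the question's RegistryClient call.
print(registry_host("europe-west3", "project_id", "vertex-pipeline-registry"))
```

Keeping the three parts separate makes it easier to point the same script at a different region or repository from CI variables.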