
Install Artifact Registry Python package from Dockerfile with Cloud Build


I have a Python package hosted in my Artifact Registry repository.

My Dataflow Flex Template is packaged within a Docker image with the following command:

gcloud builds submit --tag $CONTAINER_IMAGE .

Since developers are constantly changing the source code of the pipeline, this command is often run from their computers to rebuild the image.

Here is my Dockerfile:

FROM gcr.io/dataflow-templates-base/python311-template-launcher-base

ARG WORKDIR=/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}

ENV PYTHONPATH ${WORKDIR}
ENV FLEX_TEMPLATE_PYTHON_SETUP_FILE="${WORKDIR}/setup.py"
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/main.py"

RUN pip install --no-cache-dir -U pip && \
    pip install --no-cache-dir -U keyrings.google-artifactregistry-auth

RUN pip install --no-cache-dir -U --index-url=https://europe-west9-python.pkg.dev/sample-project/python-repo/ mypackage

COPY . ${WORKDIR}/
    
ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"]

I get the following error:

ERROR: No matching distribution found for mypackage
error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1

I guess the Cloud Build process doesn't have the access rights. I'm a bit confused about how to grant them from within a Dockerfile.

An article I found mentioned having the Docker process read a Service Account key file, but I would like to avoid that. Could I use the Service Account impersonation feature instead?


Solution

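  • The solution was simpler than expected. pip could not find the package because I misused the package location flags, --index-url and --extra-index-url: I used the former where I should have used the latter.

    --index-url replaces pip's default index, so PyPI was no longer searched at all. What I need is --extra-index-url pointing to my Artifact Registry repository so pip can find my package, while --index-url keeps its default value (PyPI) so pip can find the dependencies my package needs. Note also that the repository URL now ends in /simple/, which my original command was missing.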

    In short, here's what I just added to my Dockerfile:

    RUN pip install keyrings.google-artifactregistry-auth
    RUN pip install --extra-index-url https://europe-west9-python.pkg.dev/sample-project/python-repo/simple/ mypackage==0.3.0
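
For reference, here is a minimal sketch of how the complete Dockerfile from the question might look with this fix applied. It only merges the two snippets above. It assumes the identity running the build already has read access to the repository (for example through the Artifact Registry Reader role), which the original post does not confirm. The repository URL, package name and version are the ones from the post.

```dockerfile
FROM gcr.io/dataflow-templates-base/python311-template-launcher-base

ARG WORKDIR=/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}

ENV PYTHONPATH=${WORKDIR}
ENV FLEX_TEMPLATE_PYTHON_SETUP_FILE="${WORKDIR}/setup.py"
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/main.py"

# Keyring backend that lets pip authenticate against Artifact Registry.
# Assumption: the build environment exposes credentials with read access.
RUN pip install --no-cache-dir -U pip && \
    pip install --no-cache-dir -U keyrings.google-artifactregistry-auth

# PyPI stays the primary index (pip's default). The private repository is
# added as an extra index, and its URL must end in /simple/.
RUN pip install --no-cache-dir \
    --extra-index-url https://europe-west9-python.pkg.dev/sample-project/python-repo/simple/ \
    mypackage==0.3.0

COPY . ${WORKDIR}/

ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"]
```

One thing to keep in mind with this approach: pip treats every configured index as equal and picks the best matching version across all of them, so a package with the same name on PyPI could be chosen instead of the private one. Pinning the version, as done here, reduces that risk but does not remove it.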