docker docker-registry prefect

How can I run Prefect flows using a Docker Worker pool and a local custom Docker image?


I'm new to Prefect and have limited experience with Docker. I'm trying to deploy some existing Prefect flows so that they run inside a Docker container based on an image I built myself, but the Docker work pool apparently can't find the image locally on my machine.

Here are the steps I've been trying:

  1. Start a Prefect server: from the root of my repo, I run prefect server start, and the server starts successfully.
  2. Using the server's web UI (available at 127.0.0.1:4200), I create a new Work Pool and select the "Docker" infrastructure type. I named it "local-docker-workpool", and in the setup options I set 'Network Mode' to 'host'. All other options are left unchanged.
  3. From a new terminal, I run prefect worker start --pool "local-docker-workpool" and get a success message saying "Worker 'DockerWorker ac4a180b-b881-4a77-8713-26c70102204c' started!"
  4. From a third terminal, I deploy my workflow by running this command: prefect --no-prompt deploy --all --prefect-file prefect_dd/prefect_local.yaml. Once again, I get a success message. This time, it says: "Deployment '(dbt) Build hourly models/build-hourly-dbt-models-dev-2.0' successfully created with id 'fc3cea0c-9939-4c1f-b973-c6fcef1769d1'." Here's the content of my prefect_dd/prefect_local.yaml file:
name: mydataproject
prefect-version: 2.14.20

definitions:
  work_pools:
    dev_workpool: &dev_workpool
      name: local-docker-workpool
      work_queue_name: default
      job_variables:
        image: "prefect-docker-guide-image:latest"
        pull_policy: "IfNotPresent"
        env:
          SNOWFLAKE_ACCOUNT: "{{ $SNOWFLAKE_ACCOUNT }}"
          SNOWFLAKE_USERNAME: "{{ $SNOWFLAKE_USERNAME }}"
          SNOWFLAKE_PASSWORD: "{{ $SNOWFLAKE_PASSWORD }}"
          PREFECT_API_URL: "http://host.docker.internal:4200/api"
  tags:
    dev_tags: &dev_tags |-
      dev

deployments:
  # Use this deployment to test flow locally
  - name: build-hourly-dbt-models-dev-2.0
    version:
    description: Build dbt models scheduled for every hour
    entrypoint: prefect_dd/flows/dbt/run_hourly_dbt_jobs.py:run_hourly_dbt_jobs
    parameters: {}
    work_pool: *dev_workpool
    tags:
      - *dev_tags
    schedule:
      cron: 30 5-17 * * *
      timezone: America/Los_Angeles
      day_or: true
      active: true

Up to this point, everything seemed fine. The problem appears when I trigger a run of the deployment. From what I understand, the worker tries to create the container from the prefect-docker-guide-image:latest image but fails to find it, even though it exists on my system (I built it myself successfully). Here's the error message I'm getting:

docker.errors.ImageNotFound: 404 Client Error for http+docker://localhost/v1.45/images/create?tag=latest&fromImage=prefect-docker-guide-image: Not Found ("pull access denied for prefect-docker-guide-image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")

I don't get why it's failing to find an image I'm 100% sure exists on my local system. Just for reference, here's the output of the docker images command:

[Screenshot: output of the docker images command]

I believe I wouldn't have this problem if I pushed my Docker image to a Docker registry, but I don't want to do that for now: I plan to use the image for dev purposes, and I don't want to push it to a registry every time I change it.

Can someone please help me here?

I'm using macOS Sonoma on an Apple M3 processor.


Solution

  • It turns out I needed to add an 'image_pull_policy' job variable and set it to "Never". This way the work pool doesn't try to pull the image from any remote Docker registries.

    Updated prefect_dd/prefect_local.yaml file:

    definitions:
      work_pools:
        dev_workpool: &dev_workpool
          name: local-docker-workpool
          work_queue_name: default
          job_variables:
            image: "prefect-docker-guide-image:latest"
            image_pull_policy: "Never"
    

    This was discussed in this GitHub thread.
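
To make the behavior easier to reason about, here is a minimal Python sketch of how I understand the Docker worker decides whether to pull an image. It is not Prefect's actual code: the function names parse_tag and should_pull are hypothetical, and the default rule (an image tagged latest, or with no tag, is always pulled when no image_pull_policy is set) is an assumption based on my reading of the worker's behavior. It would explain the error above: the original file set pull_policy, which does not appear to be the variable the worker reads, so the default applied and the worker tried to pull prefect-docker-guide-image:latest from a remote registry.

```python
from typing import Optional


def parse_tag(image: str) -> Optional[str]:
    """Return the tag part of an image reference, or None if there is none."""
    # Drop any digest suffix, then only look for a colon after the last slash
    # so a registry port (e.g. localhost:5000/img) is not mistaken for a tag.
    name = image.split("@", 1)[0]
    last_segment = name.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return last_segment.rsplit(":", 1)[1]
    return None


def should_pull(image: str, policy: Optional[str], present_locally: bool) -> bool:
    """Hypothetical model of the pull decision; not Prefect's API."""
    if policy is None:
        # Assumed default: 'latest' or untagged images are always pulled.
        tag = parse_tag(image)
        policy = "Always" if tag in (None, "latest") else "IfNotPresent"
    if policy == "Always":
        return True
    if policy == "Never":
        return False
    if policy == "IfNotPresent":
        return not present_locally
    raise ValueError(f"Unknown pull policy: {policy!r}")


if __name__ == "__main__":
    image = "prefect-docker-guide-image:latest"
    # No recognized policy set: the sketch predicts a pull attempt.
    print(should_pull(image, None, present_locally=True))
    # With image_pull_policy set to "Never": no pull attempt.
    print(should_pull(image, "Never", present_locally=True))
```

Under the same assumption, tagging the local image with something other than latest (for example a hypothetical prefect-docker-guide-image:dev) would also avoid the pull, but setting image_pull_policy to "Never" is the explicit fix that worked for me.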