bazel, bazel-rules, rules-oci

Reference environment variables in Bazel oci_image entrypoint


I'm trying to build a Docker image using Bazel's rules_oci like this:

oci_image(
    name = "move_data_image",
    base = "@python_base",
    entrypoint = [
        "/opt/python/dataflow/move_data",
        "--worker_image",
        "$WORKER_IMAGE",
        "--max_num_workers",
        "$MAX_NUM_WORKERS",
        "--runner",
        "DataflowRunner",
    ],
    env = {
        "WORKER_IMAGE": "",
        "MAX_NUM_WORKERS": -1,
    },
    tars = [":move_data_layer"],
)

The idea was to use env vars as arguments to the program so that they can be changed for different executions. I was able to achieve this behavior using a Dockerfile:

ENV WORKER_IMAGE ""
ENV MAX_NUM_WORKERS -1

ENTRYPOINT python src/process_data.py --worker_image=$WORKER_IMAGE \
          --max_num_workers=$MAX_NUM_WORKERS \
          --runner=DataflowRunner

But I'm struggling to do this with Bazel. In the Bazel snippet, the env variables are passed as literal strings, so I get errors like:

error: argument --max_num_workers: invalid int value: '$MAX_NUM_WORKERS'

Since rules_oci is kind of new, I wasn't able to find a lot of documentation on the correct syntax and usage. I'm wondering if this kind of use case is supported? Thanks in advance!


Solution

  • This is the equivalent of the "shell form" of ENTRYPOINT, which is what you're looking for (you must have a shell in your base image, though):

    oci_image(
        name = "move_data_image",
        base = "@python_base",
        entrypoint = [
            # sh -c takes the whole command as one string and expands $VARS at container start.
            "/bin/sh", "-c", " ".join([
                "/opt/python/dataflow/move_data",
                "--worker_image",
                "$WORKER_IMAGE",
                "--max_num_workers",
                "$MAX_NUM_WORKERS",
                "--runner",
                "DataflowRunner",
            ]),
        ],
        env = {
            "WORKER_IMAGE": "",
            # env values must be strings, so -1 is quoted.
            "MAX_NUM_WORKERS": "-1",
        },
        tars = [":move_data_layer"],
    )

    What you attempted is equivalent to ENTRYPOINT ["/opt/python/dataflow/move_data", "--worker_image", "$WORKER_IMAGE", ...] in a Dockerfile (the "exec form"), which won't work either: no shell is involved, so $WORKER_IMAGE reaches the program as a literal string. You need something that's going to read the environment variables.

    Instead of using sh -c, you could modify your Python code to read the environment variables directly. You could also write a wrapper (shell script, Python, or something else) that would read the environment variables and build up the command line.
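
As a rough illustration of the "read the environment variables directly" option, here is a minimal sketch. It assumes the program parses its flags with argparse (the error message in the question suggests so, but that is a guess); build_parser and parse_args are hypothetical names, not anything from the original code. Each flag takes its default from the matching environment variable, so a flag passed explicitly still wins.

```python
import argparse
import os


def build_parser(environ=os.environ):
    """Build a parser whose defaults come from environment variables.

    WORKER_IMAGE and MAX_NUM_WORKERS are the variables from the question;
    the fallbacks ("" and -1) mirror the env dict in the BUILD file.
    """
    parser = argparse.ArgumentParser(description="Hypothetical move_data entry point")
    parser.add_argument(
        "--worker_image",
        default=environ.get("WORKER_IMAGE", ""),
    )
    parser.add_argument(
        "--max_num_workers",
        type=int,
        # Environment values are always strings, so convert explicitly.
        default=int(environ.get("MAX_NUM_WORKERS", "-1")),
    )
    parser.add_argument("--runner", default="DataflowRunner")
    return parser


def parse_args(argv=None, environ=os.environ):
    """Parse argv (sys.argv[1:] when None), falling back to the environment."""
    return build_parser(environ).parse_args(argv)
```

With something like this in place, the entrypoint could presumably shrink to just ["/opt/python/dataflow/move_data"], with no shell needed in the base image, and the values overridden per run with docker run -e MAX_NUM_WORKERS=8 and so on.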