Tags: bazel, starlark

Declaring both directory and inner files in a Bazel rule


I want a Bazel rule which creates a directory, takes a bunch of files, and creates an ISO out of that directory (the ISO is the intended output here, the other files are just byproducts).

# Declare root directory
root_directory = ctx.actions.declare_directory("root_directory")

# Call mkdir to create root directory
create_root_directory_arguments = ctx.actions.args()
create_root_directory_arguments.add(root_directory.path)
ctx.actions.run(
    outputs = [root_directory],
    arguments = [create_root_directory_arguments],
    executable = "mkdir",
)

...

# Declare and collect inner files
copy_output_files = []
for image_file in image_files:
    copy_output_file = ctx.actions.declare_file("root_directory/" + image_file.basename)
    copy_output_files.append(copy_output_file)

# Call cp to copy the inner files into the root directory
copy_arguments = ctx.actions.args()
copy_arguments.add("-r")
copy_arguments.add("-t", root_directory.path)
for image_file in image_files:
    copy_arguments.add(image_file.path)
ctx.actions.run(
    inputs = [root_directory] + image_files,
    outputs = copy_output_files,
    arguments = [copy_arguments],
    executable = "cp",
)

...

The problem is that if I declare the directory abc, I cannot also declare the file abc/def; Bazel throws an exception.

What is the intended solution here? Should I declare only the folder? Only the files? And how would I obtain the File objects for the run call other than by declaring them?


Solution

  • Simple answer: write a shell script and make everything you've shown a single action, with the ISO file as the only output. Bazel creates the parent directories of each declared output before running your action, so you don't need to do that yourself. Don't use ctx.actions.declare_directory.
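    A rough sketch of that single-action shape, under assumptions of mine: the rule name, the image_files attribute, and the mkisofs tool are all illustrative, not taken from the question.

```starlark
def _iso_impl(ctx):
    # The ISO is the only declared output; everything else is scratch.
    iso = ctx.actions.declare_file(ctx.label.name + ".iso")
    ctx.actions.run_shell(
        inputs = ctx.files.image_files,
        outputs = [iso],
        # $1 is the output path; the remaining arguments are the input files.
        # The staging directory lives only inside this action's sandbox.
        command = """
out="$1"; shift
mkdir -p staging
cp -r -t staging "$@"
mkisofs -o "$out" staging
""",
        arguments = [iso.path] + [f.path for f in ctx.files.image_files],
    )
    return [DefaultInfo(files = depset([iso]))]

iso_image = rule(
    implementation = _iso_impl,
    attrs = {"image_files": attr.label_list(allow_files = True)},
)
```

    Because the staging directory exists only for the duration of the one action, nothing about it needs to be declared to Bazel.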

    To expand on directory artifacts, and on what I use instead for packaging:

    Directory artifacts (the thing created by ctx.actions.declare_directory) are an advanced feature. I've used and written a lot of Bazel rules, and I've never found them useful. They only help in situations like a tool that produces a directory of ten files named after hashes of the input file contents, where those same ten files, with names that differ every build, must be passed into a later step. Directory artifacts look useful for temporary files and packaging at first, but they don't really work for those.
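    For completeness, a sketch of that narrow case where a directory artifact does fit: a tool whose output file names can't be known ahead of time. The _generator attribute and names here are hypothetical.

```starlark
def _codegen_impl(ctx):
    # One File object standing for a whole tree of unpredictable files.
    out_dir = ctx.actions.declare_directory(ctx.label.name + "_gen")
    ctx.actions.run(
        inputs = [ctx.file.src],
        outputs = [out_dir],
        executable = ctx.executable._generator,
        arguments = [ctx.file.src.path, out_dir.path],
    )
    # The tree artifact can be passed as a single input to later actions.
    return [DefaultInfo(files = depset([out_dir]))]
```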

    For reference, I usually manage files for packaging via pkg_tar, with custom rules that translate tarballs directly into the final outputs. Anything that extracts a tarball and then re-packages it tends to run into issues with file permissions (actions run unprivileged, so they can't make a file 0400 root:root, for example) and reproducibility (timestamps and sort order), but you can do that if you want.
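    A hedged sketch of the pkg_tar route (from rules_pkg); the target names here are made up. Note that the tarball records permissions and ownership as metadata, which is how it sidesteps the unprivileged-action problem:

```starlark
load("@rules_pkg//pkg:tar.bzl", "pkg_tar")

pkg_tar(
    name = "iso_root",
    srcs = [":image_files"],
    package_dir = "/",  # layout inside the archive
    mode = "0400",      # permissions Bazel actions can't set on real files
    owner = "0.0",      # root:root, without needing privileges
)
```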