I'm in the process of writing a test suite for some lexing/parsing and it would be much cleaner if I could drop test input/output files in a directory and have dune generate OCaml test cases for each of these during a step in compilation.
I figured I could use dune for this, very much inspired by this documentation page (Preprocessors and PPXs), but I'm struggling to get it to work. I've essentially come to 2 dead ends:
An alias rule that would execute a script, passing each of the test files, seemingly wouldn't work:
(tests
(names lexer)
(libraries llvmlexer llvmparser ounit2))
(rule
(alias runtest)
(deps (glob_files %{workspace_root}/**/*.ll))
(action (system "./preprocess-lexer.sh '%{input-file}'")))
As it errors with:
File "test/dune", line 9, characters 41-54:
9 | (action (system "./preprocess-lexer.sh '%{input-file}'")))
^^^^^^^^^^^^^
Error: %{input-file} isn't allowed in this position.
I'm very confused by this. Is this a matter of executing the action once for all files? If so is it possible to execute it once for each dependency?
Indeed, dune doesn't support wildcard rules at the time of writing. It does, however, have very limited support for them, tailored for preprocessing, so that you can specify a rule of the form *.ml -> *.pp.ml, with exactly these suffixes, e.g.,
(library
(name foo)
(preprocess (action (run cpp %{input-file}))))
And then if you have a file bar.ml
#define X 1
let x = X
It will be preprocessed to a bar.pp.ml file, which will be dropped in the build directory and used instead of bar.ml. This is how this mechanism works, and it is designed to work only with OCaml source files. If that suits you, you just need to fix the suffixes, i.e., rename your .ll files to .ml and specify a preprocess stanza that uses your preprocessor instead of the cpp that I used in the example.
The mechanism described above is called "preprocessing via user actions", which should not be confused with the more general (and also action-based) custom rule stanza. The common use of this stanza is to define rules of the form,
(rule
(target foo.data)
(deps foo.data.src)
(action
(with-stdin-from %{deps}
(with-stdout-to %{target}
(chdir %{workspace_root}
(run ./tools/my_rewriter.sh))))))
where ./tools/my_rewriter.sh will receive the contents of foo.data.src on stdin, and everything it prints will be redirected to foo.data. (Note that ./tools/my_rewriter.sh is the path from the top level of your project.) You can't specify a wildcard, like
(target *.data)
(deps *.data.src)
and expect it to be called for each file with the matching suffixes. Again, at the time of writing, such a mechanism is not implemented in dune. You have, however, two options as workarounds.
You can either rely on the OCaml Syntax and produce a dune file that contains such a rule replicated for each *.data.src in the folder. I wouldn't personally recommend this, as the status of the OCaml Syntax support is not clear and it might misbehave in general.
Alternatively, you can add an extra stage to your build process, e.g., a ./configure script that will generate the dune file with all these rules.
You can also write them manually, of course :)
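As a rough sketch of the generator-script route, assuming a POSIX shell and reusing the ./tools/my_rewriter.sh rule from above, something like the following could print one rule per input file. The function name gen_rules is made up for illustration, and where its output ends up is your call.

```shell
#!/bin/sh
# gen_rules DIR: hypothetical helper that prints one dune rule
# for every *.data.src file found in DIR.
gen_rules() {
  dir=$1
  for src in "$dir"/*.data.src; do
    # With no matches the pattern stays unexpanded, so skip it.
    [ -e "$src" ] || continue
    name=$(basename "$src" .src)   # foo.data.src -> foo.data
    # Only $name is expanded by the shell; %{deps} and friends are
    # passed through literally for dune to interpret.
    cat <<EOF
(rule
 (target $name)
 (deps $name.src)
 (action
  (with-stdin-from %{deps}
   (with-stdout-to %{target}
    (chdir %{workspace_root}
     (run ./tools/my_rewriter.sh))))))
EOF
  done
}
```

For instance, gen_rules test > test/dune.inc would write the rules to a separate file (the name dune.inc is an assumption) that the hand-written dune file could pull in with an (include dune.inc) stanza. The script has to be rerun whenever input files are added or removed.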
You can use glob_files and then change your action so that it takes a set of files and produces a set of files, e.g., using GNU parallel,
(rule
(deps (glob_files *.data.in))
(action (run parallel cp {} {.} ::: %{deps})))
And this rule will produce <foo>.data for each <foo>.data.in. (Of course, you can write your own for loop instead of using parallel.)
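If GNU parallel is not available, a plain loop should behave the same way, only sequentially. This is a minimal sketch: copy_stripped is a hypothetical name, and the `${f%.*}` expansion plays the role of parallel's `{.}` by dropping the last extension.

```shell
#!/bin/sh
# copy_stripped FILE...: sequential stand-in for
#   parallel cp {} {.} ::: FILE...
copy_stripped() {
  for f in "$@"; do
    # ${f%.*} removes the shortest trailing ".something":
    # foo.data.in -> foo.data
    cp "$f" "${f%.*}"
  done
}
```

Saved as a script next to the dune file (say, a hypothetical ./copy-stripped.sh that ends with copy_stripped "$@"), it could take the place of parallel in the action and receive %{deps} as its arguments.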
The caveat with this approach is that, since this rule doesn't specify targets, all produced files will eventually be deleted by dune. And the problem is that, unlike deps, the targets field doesn't accept glob_files, which makes perfect sense, as the targets are not expected to exist at the time of rule application.
To the rescue comes the new directory-targets feature. To enable it, you need the following in your dune-project (the lang must be greater than or equal to 3.0):
(lang dune 3.0)
(using directory-targets 0.1)
Now you can put the test input data files that you would like to preprocess in the same folder as your test driver. In this case, I use *.data.src as the input files and test_foo.ml as the test driver:
(rule
(deps (glob_files *.data.src))
(target (dir data))
(action
(progn
(run mkdir -p data)
(run parallel cp {} data/{.} ::: %{deps}))))
(test
(name test_foo)
(deps data))
The (run parallel cp {} data/{.} ::: %{deps}) action will call cp <file>.data.src data/<file>.data for each <file> matching *.data.src. You can substitute it with your own command that takes the set of input files and populates the data directory with the preprocessed files. This command could even be implemented in OCaml: just specify ./path/to/your/tool.exe as the command and dune will build it automatically from ./path/to/your/tool.ml.
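To make the "your own command" idea concrete, here is one possible shape for such a command, written as a shell function. Everything in it is an assumption for illustration: the name populate_dir, the convention that the first argument is the output directory, and the plain cp standing in for the real preprocessing step.

```shell
#!/bin/sh
# populate_dir OUT_DIR FILE...: hypothetical replacement for the
# mkdir + parallel pair; writes OUT_DIR/<file>.data for every
# <file>.data.src given on the command line.
populate_dir() {
  out=$1
  shift
  mkdir -p "$out"
  for src in "$@"; do
    base=$(basename "$src" .src)   # foo.data.src -> foo.data
    # Placeholder for the real preprocessing step: a plain copy.
    cp "$src" "$out/$base"
  done
}
```

Wrapped in a script that ends with populate_dir "$@", the action would shrink to a single run of that script with data and %{deps} as arguments. The directory target still has to be declared in the rule exactly as shown above.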
In this setup, whenever you change an input *.data.src file, or any other dependency of the test, dune test will rebuild the data folder and correctly rerun the tests.
For the sake of completeness, here are the contents of my test_foo.ml file,
open Printf
let () =
Sys.readdir "data" |> Array.iter @@ fun file ->
if Filename.check_suffix file ".data"
then printf "testing with %s\n%!" file
And here's a sample directory structure,
$ tree
.
|-- bar.ml
|-- dune
|-- dune-project
`-- test
    |-- bar.data.src
    |-- dune
    |-- foo.data.src
    `-- test_foo.ml
1 directory, 7 files
Feel free to poke me if you want to get a fully working example.