I have the following situation:

- `find-deps` is an external program that is very quick to run and discovers dependency information, similar to `ghc -M`. Its output is some file `deps`.
- `compile` is an external program that is very slow to run; unlike `ghc --make`, it is very slow even if none of the inputs have changed.

So the idea is to add a Shake rule that runs `find-deps` to produce `deps`, parses it into a list of files `srcs`, and then the compilation rule would `need srcs` to ensure that `compile` is only re-run if any of the sources discovered by `find-deps` has changed.
The tricky part is that `find-deps` needs to `alwaysRerun`, to discover newly-depended-on source files. So now if the `compile` rule depends on `deps` to get the list of files, it will also `alwaysRerun`. The standard solution would be to use an oracle: we can add an oracle that `need`s `deps` and parses it into a list of files, and then the `compile` rule would first ask for that list of source files, and only `need` them. So there is no `alwaysRerun` on the `need` chain of `compile`.
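
For a single Shakefile, that oracle setup might look roughly like the sketch below. The `parseDeps` helper, the one-path-per-line format of `deps`, and the fixed directory `"project"` are assumptions made for illustration; the command lines are copied from the rules further down.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving, TypeFamilies #-}
import Development.Shake
import Development.Shake.Classes
import Development.Shake.FilePath

-- Oracle question: "which source files does the project in this directory use?"
newtype Sources = Sources FilePath
    deriving (Show, Typeable, Eq, Hashable, Binary, NFData)
type instance RuleResult Sources = [FilePath]

-- Assumption: deps lists one source path per line, relative to dir.
parseDeps :: FilePath -> String -> [FilePath]
parseDeps dir = map (dir </>) . filter (not . null) . lines

buildRules :: Rules ()
buildRules = do
    let dir = "project"  -- hypothetical directory
    want [dir </> "exe"]

    -- Installed exactly once. readFile' also depends on the file it reads,
    -- so the oracle is re-evaluated whenever deps is rebuilt.
    _ <- addOracle $ \(Sources d) ->
        parseDeps d <$> readFile' (d </> "deps")

    dir </> "deps" %> \depFile -> do
        alwaysRerun
        cmd_ (Cwd dir) "find-deps" ["-o", depFile]

    dir </> "exe" %> \exeFile -> do
        -- Depends on the oracle's answer, not directly on deps.
        srcs <- askOracle (Sources dir)
        need srcs
        cmd_ (Cwd dir) "compile" ["-o", exeFile]

main :: IO ()
main = shakeArgs shakeOptions buildRules
```

Because the `exe` rule depends on the oracle's answer rather than on `deps` itself, it is only re-run when the parsed list or one of the listed files changes.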
However, in my case, I am not writing a particular Shakefile. Instead, I am writing a library of reusable `Rules` that users can use to make their own main Shakefile. So I'd need to package it up as something like

    myRules :: FilePath -> Rules ()
    myRules dir = do
        dir </> "deps" %> \depFile -> do
            alwaysRerun
            cmd_ (Cwd dir) "find-deps" ["-o", depFile]

        dir </> "exe" %> \exeFile -> do
            srcs <- askOracle $ Sources dir
            need srcs
            cmd_ (Cwd dir) "compile" ["-o", exeFile]

But where would I put the `addOracle $ \(Sources dir) -> ...` part that would `need [dir </> "deps"]`, parse it, and return a list of source files? I can't put it in `myRules`, because then two invocations of `myRules` with different directories will try to install an oracle handler twice for the same type. And I can't make `dir` part of the oracle question type, because it is a term-level variable, so I can't lift it into a `Symbol` index of the query.
And that leaves me with something super-lame like having an `includeThisOnlyOnce :: Rules ()` that the user has to remember to include exactly once in their Shakefile.
So my question is:

- Is there a way to do this (avoid re-running `compile` when no source files have changed) without involving an oracle?
- Is there a way to scope the `Sources` oracle only to the context of each individual invocation of `myRules someDir`?

Is there a way to track dependencies without involving an oracle?
Yes: if the output of `find-deps` doesn't change at all, then `compile` won't be re-run. You can achieve that by specifying a `Change` value such as `ChangeModtimeAndDigest` (via the `shakeChange` option), but that is a global setting. Alternatively, you can put the output of `find-deps` somewhere such as `foo.deps.out` and then call `copyFileChanged "foo.deps.out" "foo.deps"`, which won't update the timestamp if the file hasn't changed.
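
As a rough sketch of the `copyFileChanged` route, the reusable rules could be split into a scratch file that is regenerated on every build and a stable copy that is only rewritten when its contents differ. The `deps.out` file name, the `parseDeps` helper, and the one-path-per-line format of `deps` are assumptions for illustration; the command lines are taken from the question.

```haskell
import Development.Shake
import Development.Shake.FilePath

-- Assumption: deps lists one source path per line, relative to dir.
parseDeps :: FilePath -> String -> [FilePath]
parseDeps dir = map (dir </>) . filter (not . null) . lines

myRules :: FilePath -> Rules ()
myRules dir = do
    -- Scratch output: regenerated on every build.
    dir </> "deps.out" %> \rawFile -> do
        alwaysRerun
        cmd_ (Cwd dir) "find-deps" ["-o", rawFile]

    -- Stable copy: copyFileChanged depends on the scratch file but leaves
    -- the destination untouched when the contents are identical, so rules
    -- depending on deps see no change.
    dir </> "deps" %> \depFile ->
        copyFileChanged (dir </> "deps.out") depFile

    dir </> "exe" %> \exeFile -> do
        -- readFile' also records a dependency on deps.
        srcs <- parseDeps dir <$> readFile' (dir </> "deps")
        need srcs
        cmd_ (Cwd dir) "compile" ["-o", exeFile]

main :: IO ()
main = shakeArgs shakeOptions $ do
    myRules "project"  -- hypothetical directory
    want ["project" </> "exe"]
```

Since no oracle is involved, `myRules` can be called several times with different directories in the same Shakefile.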
Is there a way to separate oracles of the same type?
Not easily and immediately, although I can see why it's useful. I can think of two potential routes to solve it:

- Adding an `addOracleIdempotent` which ignored any errors about adding the same oracle repeatedly. That's a moderately easy change to Shake (essentially set a flag in `Rules` to ignore duplicates).
- Lifting `dir` to the type level and ensuring each oracle has a different type. It probably makes your API more complicated and requires type magic.

Of all these solutions, I'd use `copyFileChanged`, as it's simple and local.