In a large number of Tupfiles I use extensive pipelining, e.g.
: input |> < %f command1 | command2 > %o |> output
The problem with this is that Tup calls system, which executes these :-rules in sh, and sh doesn't support set -o pipefail. As a result, if only command1 fails, tup will still mark the rule as a success, because the pipeline as a whole exits with code 0. This is highly problematic.
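The difference is easy to see outside of Tup with a throwaway pipeline (illustration only; false and cat are stand-ins for command1 and command2):
sh -c 'false | cat'; echo $?                      # prints 0: sh reports only the last command's status
bash -c 'set -o pipefail; false | cat'; echo $?   # prints 1: the failure of the first command is propagated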
I know of two solutions to this, neither of which is ideal.
a) I could abandon pipelining and instead do:
: input |> < %f command1 > %o |> intermediate
: intermediate |> < %f command2 > %o |> output
This would work, but it would require tediously rewriting a bunch of rules and, more importantly, it would use significantly more disk space and disk writes every time there is an update.
b) I can wrap every command in bash, like:
: input |> bash -c 'set -o pipefail && < %f command1 | command2 > %o' |> output
This seems slightly better, as it involves fewer rewrites and avoids the extra I/O, but it is still very cumbersome. It also requires escaping any ' characters in my :-rules.
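For example, a rule whose command contains a literal single quote ends up looking something like this (the grep/sort pipeline is just a made-up example):
: input |> bash -c 'set -o pipefail && grep '\''foo bar'\'' %f | sort > %o' |> output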
Ideally, Tup would have a config option to specify which shell / interpreter is used to run :-rules, and perhaps also a common prefix, so that every command could be run with set -o pipefail && (or anything else I want) prepended. As far as I know this is not currently possible; a wrapper around system would have to be injected wherever tup invokes a rule. However, maybe I've missed some aspect of Tup that would allow something more elegant than the two solutions proposed.
Edit: While interposing system did allow me to "inject" pipefail into calls to system, I misstated the fact that programs are run using system. With some help from the mailing list it turns out that they are actually run using execle. Below is the code I used to do the interposition, in case anyone wants to accomplish the same thing.
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <unistd.h>
int execle(const char* path, const char* arg0, ...) {
    /* We're going to interpose this function, modify the arguments if we need
     * to, and then convert it into a call to execve. Due to a weirdness in the
     * consts of the api, we need to discard a const qualifier on the
     * characters in the arguments. The call is `int execve(const char*
     * filename, char* const argv[], char* const envp[]);` but it should
     * probably be `int execve(const char* filename, const char* const argv[],
     * char* const envp[]);` at the very least, i.e. arguments shouldn't be
     * modified. These aren't actually modified by the call, so in order to
     * avoid the inefficiency of copying the strings into memory we don't need,
     * we just do this unsafely and compile with `-Wno-discarded-qualifiers`.
     */
    // Count the number of variable arguments for malloc
    unsigned int num_args;
    va_list ap;
    va_start(ap, arg0);
    if (arg0) {
        num_args = 1;
        while (va_arg(ap, const char*)) {
            num_args++;
        }
    } else {
        num_args = 0;
    }
    char* const* env = va_arg(ap, char* const*); // Also grab env
    va_end(ap);
    // Test for specific tup execle call
    va_start(ap, arg0);
    int intercept = num_args == 4
        && strcmp(path, "/bin/sh") == 0
        && strcmp(arg0, "/bin/sh") == 0
        && strcmp(va_arg(ap, const char*), "-e") == 0
        && strcmp(va_arg(ap, const char*), "-c") == 0;
    va_end(ap);
    // Switch on whether to intercept the call, or pass it on
    /*const*/ char** args;
    if (intercept) { // We want to switch to bash with pipefail enabled
        args = malloc(7 * sizeof(*args));
        path = "/bin/bash";
        args[0] = "/bin/bash";
        args[1] = "-e";
        args[2] = "-o";
        args[3] = "pipefail";
        args[4] = "-c";
        va_start(ap, arg0);
        va_arg(ap, const char*);
        va_arg(ap, const char*);
        args[5] = va_arg(ap, const char*); // command
        va_end(ap);
        args[6] = NULL;
    } else { // Just copy args into a null terminated array for execve
        args = malloc((num_args + 1) * sizeof(*args));
        char** ref = args;
        if (arg0) {
            *ref++ = arg0;
            const char* arg;
            va_start(ap, arg0);
            while ((arg = va_arg(ap, const char*))) {
                *ref++ = arg;
            }
            va_end(ap);
        }
        *ref = NULL;
    }
    int error_code = execve(path, args, env);
    free(args);
    return error_code;
}
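To use it, build a shared object from that file and preload it when invoking tup; something along these lines should work (the file name pipefail_execle.c is arbitrary, it's just where I saved the code above):
gcc -shared -fPIC -Wno-discarded-qualifiers -o pipefail_execle.so pipefail_execle.c
LD_PRELOAD="$PWD/pipefail_execle.so" tup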
You could implement your own system as:
switch (pid = fork()) {
case 0:
    // Modify command to prepend "set -o pipefail &&" to it.
    execl("/bin/bash", "bash", "-c", command, (char *) 0);
case -1: // handle fork error
default:
    waitpid(pid, ...);
}
and LD_PRELOAD that system implementation into your tup process.
If you don't feel like doing low-level process management, you can interpose system to just wrap the command in bash -c "set -o pipefail && " and escape quotes, then invoke the original system. See this article on library interposition.
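A rough sketch of that second approach, assuming glibc's RTLD_NEXT and glossing over the quote escaping it would need:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int system(const char* command) {
    // Look up the real system() so we can delegate to it after rewriting.
    int (*real_system)(const char*) =
        (int (*)(const char*))dlsym(RTLD_NEXT, "system");
    if (!command)
        return real_system(command); // NULL means "is a shell available?"; pass it through
    // Wrap the command so bash runs it with pipefail enabled.
    // NOTE: single quotes inside `command` would need escaping; omitted here.
    size_t len = strlen(command) + sizeof("bash -c 'set -o pipefail && '");
    char* wrapped = malloc(len);
    if (!wrapped)
        return -1;
    snprintf(wrapped, len, "bash -c 'set -o pipefail && %s'", command);
    int status = real_system(wrapped);
    free(wrapped);
    return status;
}
Compile it as a shared object the same way as above (adding -ldl on older glibc) and LD_PRELOAD it into tup.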