bash tup

set pipefail for commands run by tup


In many of my Tupfiles I use extensive pipelining, e.g.

: input |> < %f command1 | command2 > %o |> output

The problem is that Tup calls system, which executes these :-rules in sh, which doesn't support set -o pipefail. As a result, if only command1 fails, tup still marks the rule as successful, because the exit code of a pipeline is that of its last command (here command2, which exited with 0). This is highly problematic.
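To make the exit-status behaviour concrete, here is a minimal C sketch. The helper name exit_status_of is hypothetical, and the sketch assumes that system() hands the command to /bin/sh and that bash is on the PATH.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run cmd through system() (that is, /bin/sh -c) and
 * return its exit status, or -1 if the command could not be run or did not
 * exit normally. */
int exit_status_of(const char *cmd) {
    int status = system(cmd);
    if (status == -1 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

With this helper, exit_status_of("false | true") is 0 because only the last command of the pipeline counts, while exit_status_of("bash -o pipefail -c 'false | true'") is 1.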

I know of two solutions to this, neither of which is ideal.

a) I could abandon pipelining and instead do:

: input |> < %f command1 > %o |> intermediate
: intermediate |> < %f command2 > %o |> output

This works, but it would require tediously rewriting a bunch of rules and, more importantly, would use significantly more disk space and disk writes on every update.

b) I can wrap every command in bash like:

: input |> bash -c 'set -o pipefail && < %f command1 | command2 > %o' |> output

This seems slightly better, as it involves fewer rewrites and avoids the extra I/O, but it is still very cumbersome. It also requires escaping any ' in my :-rules.
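For reference, the escaping needed inside a single-quoted bash -c argument is mechanical: each ' becomes '\'' (close the quote, emit an escaped quote, reopen the quote). A rough C sketch of that rule follows; the function name sh_single_quote is hypothetical and nothing like it is provided by tup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of s wrapped in single
 * quotes, with every embedded ' rewritten as '\''. The caller frees the
 * result. Returns NULL if allocation fails. */
char *sh_single_quote(const char *s) {
    size_t len = 2; /* opening and closing quote */
    for (const char *p = s; *p; p++) {
        len += (*p == '\'') ? 4 : 1;
    }

    char *out = malloc(len + 1);
    if (!out) {
        return NULL;
    }

    char *w = out;
    *w++ = '\'';
    for (const char *p = s; *p; p++) {
        if (*p == '\'') {
            memcpy(w, "'\\''", 4); /* the four characters ' \ ' ' */
            w += 4;
        } else {
            *w++ = *p;
        }
    }
    *w++ = '\'';
    *w = '\0';
    return out;
}
```

The result can be passed as the single argument after bash -c, so the original rule text reaches bash unchanged.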

Ideally there would be a Tup config option that specifies which shell / interpreter to use for :-rules. There would also be an option for a common prefix, so all rules could be run with set -o pipefail && or anything else I want. As far as I know, neither is currently possible; it would need a wrapper around system that applies whenever tup invokes a rule. However, maybe I've missed some aspect of Tup that allows something more elegant than the two solutions above.

Edit: Interposing system did let me "inject" pipefail into calls to system, but I misstated that tup runs programs using system. With some help from the mailing list, it turns out that they are actually run using execle. Below is the code I used to do the interposition, in case anyone wants to accomplish the same thing.

Solution

#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <unistd.h>

int execle(const char* path, const char* arg0, ...) {
    /* We're going to interpose this function, modify the arguments if we need
     * to, and then convert it into a call to execve. Due to a weirdness in the
     * consts of the api, we need to discard a const qualifier on the
     * characters in the arguments. The call is `int execve(const char*
     * filename, char* const argv[], char* const envp[]);` but it should
     * probably be `int execve(const char* filename, const char* const argv[],
     * char* const envp[]);` at the very least, i.e. arguments shouldn't be
     * modified. These aren't actually modified by the call, so in order to
     * avoid the inefficiency of copying the strings into memory we don't need,
     * we just do this unsafely and compile with `-Wno-discarded-qualifiers`.
     * */

    // Count the number of variable arguments for malloc
    unsigned int num_args;
    va_list ap;
    va_start(ap, arg0);
    if (arg0) {
        num_args = 1;
        while(va_arg(ap, const char*)) {
            num_args++;
        }
    } else {
        num_args = 0;
    }
    char* const* env = va_arg(ap, char* const*); // Also grab env
    va_end(ap);

    // Test for specific tup execle call
    va_start(ap, arg0);
    int intercept = num_args == 4
        && strcmp(path, "/bin/sh") == 0
        && strcmp(arg0, "/bin/sh") == 0
        && strcmp(va_arg(ap, const char*), "-e") == 0
        && strcmp(va_arg(ap, const char*), "-c") == 0;
    va_end(ap);

    // Switch on whether to intercept the call, or pass it on
    /*const*/ char** args;
    if (intercept) { // We want to switch to bash with pipefail enabled
        args = malloc(7 * sizeof(*args));
        path = "/bin/bash";
        args[0] = "/bin/bash";
        args[1] = "-e";
        args[2] = "-o";
        args[3] = "pipefail";
        args[4] = "-c";

        va_start(ap, arg0);
        va_arg(ap, const char*);
        va_arg(ap, const char*);
        args[5] = va_arg(ap, const char*); // command
        va_end(ap);

        args[6] = NULL;

    } else { // Just copy args into a null terminated array for execve
        args = malloc((num_args + 1) * sizeof(*args));

        char** ref = args;
        if (arg0) {
            *ref++ = arg0;
            const char* arg;
            va_start(ap, arg0);
            while ((arg = va_arg(ap, const char*))) {
                *ref++ = arg;
            }
            va_end(ap);
        }
        *ref = NULL;

    }

    int error_code = execve(path, args, env);

    free(args);
    return error_code;
}

Solution

  • You could implement your own system as

    switch(pid = fork()) {
      case 0:
        // Modify command to prepend "set -o pipefail &&" to it.
        execl("/bin/bash", "bash", "-c", command, (char *) 0);
        _exit(127); // only reached if execl fails
      case -1:
        // handle fork error
        break;
      default:
        waitpid(pid, ...);
    }
    

    and LD_PRELOAD that system implementation into your tup process.

    If you don't feel like doing low-level process management, you can interpose system to just wrap the command in bash -c "set -o pipefail && " and escape quotes, then invoke the original system. See this article on library interposition.
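A fuller sketch of the fork/exec variant might look like the following. It assumes bash lives at /bin/bash and passes -o pipefail as a bash option instead of prepending it to the command. The name pipefail_system is hypothetical; it would be renamed to system when built into the preloaded library. Unlike a real system implementation, it does not adjust signal handling while waiting.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical replacement for system(): runs command under bash with
 * pipefail enabled and returns the wait status, or -1 on error. */
int pipefail_system(const char *command) {
    if (command == NULL) {
        return 1; /* system(NULL) asks whether a shell is available */
    }

    pid_t pid = fork();
    if (pid == -1) {
        return -1; /* fork failed */
    }
    if (pid == 0) {
        /* Child: assumes bash is installed at /bin/bash. */
        execl("/bin/bash", "bash", "-o", "pipefail", "-c", command,
              (char *)NULL);
        _exit(127); /* only reached if execl fails */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return status;
}
```

As with system, the return value is a wait status, so callers would inspect it with WIFEXITED and WEXITSTATUS.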