linuxpipeqsubgrid-computingsungridengine

How can I use a pipe or redirect in a qsub command?


There are some commands I'd like to run on a grid using qsub (SGE 8.1.3, CentOS 5.9) that need to use a pipe (|) or a redirect (>). For example, let's say I have to parallelize the command

echo 'hello world' > hello.txt

(Obviously a simplified example: in reality I might need to redirect the output of a program like bowtie directly to samtools). If I did:

qsub echo 'hello world' > hello.txt

the resulting content of hello.txt would look like

Your job 123454321 ("echo") has been submitted

Similarly if I used a pipe (echo "hello world" | myprogram), that message is all that would be passed to myprogram, not the actual stdout.

I'm aware I could write a small bash script that each contain the command with the pipe/redirect, and then do qsub ./myscript.sh. However, I'm trying to run many parallelized jobs at the same time using a script, so I'd have to write many such bash scripts each with a slightly different command. When scripting this solution can start to feel very hackish. An example of such a script in Python:

for i, (infile1, infile2, outfile) in enumerate(files):
    command = ("bowtie -S %s %s | " +
               "samtools view -bS - > %s\n") % (infile1, infile2, outfile)

    script = "job" + str(counter) + ".sh"
    open(script, "w").write(command)
    os.system("chmod 755 %s" % script)
    os.system("qsub -cwd ./%s" % script)

This is frustrating for a few reasons, among them that my program can't even delete the many jobXX.sh scripts afterwards to clean up after itself, since I don't know how long the job will be waiting in the queue, and the script has to be there when the job starts.

Is there a way to provide my full echo 'hello world' > hello.txt command to qsub without having to create another file containing the command?


Solution

  • You can do this by turning it into a bash -c command, which lets you put the | in a quoted statement:

     qsub bash -c "cmd <options> | cmd2 <options>"
    

    As @spuder has noted in the comments, it seems that in other versions of qsub (not SGE 8.1.3, which I'm using), one can solve the problem with:

    echo "cmd <options> | cmd2 <options>" | qsub
    

    as well.