xargs gnu-parallel

Switching from xargs to parallel solely to get --joblog output, all else the same. How?


I need to grab the exit status from a command that I run via xargs. Since that's not possible it's recommended to use parallel. Some documentation suggests that it's often as simple as swapping the usage of xargs with parallel. My usage of xargs is so simple that I would think this applies to me.

To restate my post-title, the ONLY reason I want to use parallel is to be able to parse the file created with --joblog. In other words, I don't actually want to run anything in parallel.

I am failing to figure out how to do what I would think is a simple task, and Google searches have not turned up an answer. I'm hoping someone here can put me to shame and show me how simple this is.

Here's some boiled-down test code to demonstrate the issue I'm bumping up against.

#!/bin/bash

tmpDir=/tmp/tmpTest.$$
tmpArgs=/tmp/tmpArgs.$$
tmpJobLog=/tmp/tmpJobLog.$$

mkdir "$tmpDir"
cd "$tmpDir" || exit 1
touch aaa bbb 'c c' 'd d'

cat << EOF > $tmpArgs
--create
--file
test.tar
aaa
bbb
c c
d d
EOF

# I'm only using 'tar' for this simple test, it doesn't need to be 'tar',
# but it demonstrates my problem easily when comparing xargs vs parallel usage.
#
tr '\n' '\0' < $tmpArgs | xargs -0 tar

tar -t -f test.tar
/bin/rm test.tar

tr '\n' '\0' < $tmpArgs | parallel --joblog $tmpJobLog -0 tar

cat $tmpJobLog

/bin/rm -r /tmp/tmpTest.$$
/bin/rm $tmpArgs
/bin/rm $tmpJobLog

The xargs line works as I'd expect, and the tar -t -f test.tar prints out what you'd expect. That is:

aaa
bbb
c c
d d

However, my simple-minded substitution of parallel for xargs seems to run tar once per argument, in parallel, creating a host of errors (of course):

tar: Cowardly refusing to create an empty archive
Try 'tar --help' or 'tar --usage' for more information.
tar: option '--file' requires an argument
Try 'tar --help' or 'tar --usage' for more information.
tar: invalid option -- 'e'
<SNIP>
tar: invalid option -- ' '
Try 'tar --help' or 'tar --usage' for more information.

Examining the joblog file it's clear what's going on:

Seq Host    Starttime   JobRuntime  Send    Receive Exitval Signal  Command
1   :   1746839006.000       0.000  0   0   2   0   tar --create
2   :   1746839006.000       0.000  0   0   64  0   tar --file
3   :   1746839006.000       0.000  0   0   64  0   tar test.tar
4   :   1746839006.000       0.000  0   0   2   0   tar aaa
5   :   1746839006.000       0.000  0   0   2   0   tar bbb
6   :   1746839006.000       0.000  0   0   64  0   tar 'c c'
7   :   1746839006.000       0.000  0   0   64  0   tar 'd d'

So that was a long-winded way of asking: how do I simply get parallel to behave like xargs?

Thanks kindly!


Solution

  • As you don't show the arguments being chunked/grouped in any way, it seems you wish to have xargs run a single command that consumes all the arguments.

    If that is the case, simply running the command without wrapping it with xargs will provide direct access to its exit code. For example:

    mapfile -t <"$tmpArgs"
    tar "${MAPFILE[@]}"
    echo "exit code = $?"
    

    Or even:

    ( set -f; IFS=$'\n'; tar $(<"$tmpArgs") )
    

    Or for any POSIX shell:

    ( set -f; IFS='
    '; tar $(cat "$tmpArgs") )
    

    The absence of quoting around $(...) activates word-splitting using IFS (but set -f prevents any pathname expansion). The sub-shell context makes the noglob and IFS changes temporary, and sets $? to the exit value of the final command within.
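
To see the effect of the unquoted expansion described above, here is a minimal sketch in POSIX shell. The helper names count_args and argc_from_file are made up for this illustration, and the demo file contents are arbitrary; the point is only that each line becomes exactly one argument, even when it contains a space or a glob character.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# argc_from_file (also hypothetical) expands a file into one argument per
# line and reports the resulting argument count.
argc_from_file() {
    (
        # noglob: a line such as "*" stays literal instead of matching files
        set -f
        # split on newlines only, so "c c" stays a single argument
        IFS='
'
        # deliberately unquoted: this is where word-splitting happens
        count_args $(cat "$1")
    )
}

# Demo with three lines, one containing a space and one a glob character.
demo=$(mktemp)
printf '%s\n' aaa 'c c' '*' > "$demo"
argc_from_file "$demo"    # prints 3
rm -f "$demo"
```

Because everything runs inside ( ... ), the caller's IFS and globbing settings are untouched afterwards.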


    To use parallel, the documentation says:

    With --xargs GNU parallel will fit as many arguments as possible on a single line:

    So:

    tr '\n' '\0' <"$tmpArgs" | parallel -0 --joblog "$tmpJobLog" --xargs tar
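
Since the goal is to parse the joblog afterwards, here is a rough sketch of pulling the exit values out with awk. It assumes the tab-separated column layout shown in the question (Exitval in column 7, Command in column 9); joblog_failures is a hypothetical helper name, not something GNU parallel provides.

```shell
# joblog_failures (hypothetical helper) prints "Seq<TAB>Exitval<TAB>Command"
# for every job in a --joblog file whose Exitval is non-zero.
# It returns 1 if at least one such job was found, 0 otherwise.
# Assumption: fields are tab-separated in the order
#   Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
joblog_failures() {
    awk -F'\t' '
        NR == 1 { next }                                  # skip the header row
        $7 != 0 { print $1 "\t" $7 "\t" $9; bad = 1 }     # report failed jobs
        END     { exit (bad ? 1 : 0) }
    ' "$1"
}
```

After the parallel run this could be called as joblog_failures "$tmpJobLog" || echo "tar failed". If the argument list were too long to fit on one command line, --xargs would start more than one job, so the log can hold several rows; the helper reports each failing row separately.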