gnu-parallel

How to use gnu parallel to run the same command on multiple servers using sshpass?


I have a workable solution so far, but I'm wondering whether there is a way that relies more on parallel's built-in options.

I'm running parallel --version 20240422.

I have read man parallel, https://www.gnu.org/software/parallel/parallel_tutorial.html, and the book, but none of the examples match my situation, which specifically requires:

  1. password authentication (ideally with sshpass), not public/private keys
  2. a list of servers saved in a file (so it's easily editable)
  3. support for servers on different ports: some use the default 22, some don't

SSHPASS is already exported in the environment.

I tried numerous methods and am unsure whether this is a syntax problem or an environment setup problem (instances (subshells?) started by parallel not inheriting SSHPASS).

To check that the command and environment are correct without parallel first: sshpass -e ssh <user>:@<ip> -p <port> works.


Quirk: I'm adding : after user as a workaround, because GNU Parallel seems to require it.

I have no idea why GNU Parallel requires that :, but I add it to all experiments from here on, even if it may be overspecified. Any advice on how to debug it? I have the GNU Parallel Perl source and the VSCode Perl Language Server installed, but the debugger stops early in sub parse_options() due to some language server bug. Someone else also wanted to know: Is there a way to debug GNU Parallel?


Because sshpass -e ssh <user>:@<ip> -p <port> works, I expected that parallel --slf iplist echo ::: env would work too, with a file iplist storing all the servers and their different IPs and ports:

sshpass -e ssh <user>:@<ip1> -p <port1>
sshpass -e ssh <user>:@<ip2> -p <port2>

Each line is an sshlogin in GNU Parallel vocabulary, with the format [sshcommand [options]] [username[:password]@]hostname (from man parallel).

Output

Two lines of parallel: Warning: Could not figure out number of cpus on sshpass -e ssh <user>:@<ip> -p <port> (). Using 1., followed by Permission denied, please try again.

Change parallel to env_parallel

The last error usually happens when you enter a wrong or empty password in ssh. So I wondered whether SSHPASS was missing from the instances started by parallel, and ran env_parallel --env SSHPASS --slf iplist echo ::: env, only to see exactly the same error as with the previous command.

I did not try the -S syntax because it's impractical with multiple servers.
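One way to rule out the environment theory locally is to check whether a variable is really exported, i.e. visible to child processes. The sketch below uses a hypothetical helper named is_exported (not part of GNU Parallel or sshpass). It only tests local child processes and says nothing about what reaches a remote host.

```shell
# is_exported: hypothetical helper, not part of GNU Parallel or sshpass.
# Succeeds (exit status 0) only if the variable named in $1 is present in
# the environment of a freshly started child process.
is_exported() {
    sh -c 'printenv "$1" >/dev/null' sh "$1"
}

# Example (assumes SSHPASS was set beforehand, so it is left commented out):
# is_exported SSHPASS && echo "SSHPASS is inherited by child processes"
```

If this succeeds for SSHPASS, a locally started parallel should inherit the variable as well, which would point at the sshlogin syntax rather than the environment.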

Keeping it simple

I decided to stop depending on options like --ssh, --slf, and -S because I don't know what they do under the hood (blocked by the aforementioned debugger problem), and instead put as much of the required information as possible into a single command.

iplist contains

<user>:@<ip1>
<user>:@<ip2>

Now this partially works. (With plain parallel instead of env_parallel, it gives Permission denied, please try again. twice.)

cat iplist | env_parallel --env SSHPASS sshpass -e "ssh {} -p <port> 'env;hostname'"

Can't change ports!

Because some servers use the default port 22, I did not want to hardcode the port in the command but to put it in the iplist file, like <user>:@<ip2> -p <port>, and call cat iplist | env_parallel --env SSHPASS sshpass -e "ssh {} 'env;hostname'". Now I get hostname contains invalid characters.

--dry-run shows sshpass -e ssh '<user>:@<ip> -p <port>' 'env;hostname', which gives the same error when run by itself without parallel (I did this as an extra check). None of the other quoting attempts I made in iplist resolved this error.
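The --dry-run output is consistent with how arguments are passed: the whole input line, spaces included, reaches ssh as a single quoted argument, so ssh treats it as the host name. A small sketch of that effect in bash or sh, using a hypothetical helper count_args and a placeholder address:

```shell
# count_args: hypothetical helper that prints how many arguments it received.
count_args() {
    echo "$#"
}

line='user:@192.0.2.10 -p 2222'   # placeholder sshlogin followed by a port option

count_args "$line"   # quoted: the whole line arrives as one argument
count_args $line     # unquoted: bash/sh split it into three separate words
```

The quoted call mirrors what ssh receives from {} here: one argument that still contains " -p 2222".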

Looks solved, but can it be better?

Then I noticed that --colsep can interpret the input as a table, and cat iplist | env_parallel --colsep ' ' --env SSHPASS "sshpass -e ssh {1} -p {2} 'env;hostname'" > results.txt works.

iplist contains

ip1 port1
ip2 port2

This --colsep approach saved me from unpivoting the list with awk (awk '{printf "%s\n%s\n", $1, $2}' iplist | env_parallel <truncated>) into

ip1
port1
ip2
port2

and then using parallel's -N2 to read 2 rows at a time, referring to the IP and port with the replacement strings {1} and {2} respectively.

Still, I had to hardcode 22 in iplist for IPs that use the default port instead of leaving it out; otherwise it either breaks this structure, or ssh's -p option interprets whatever argument follows as the port number and errors out. There must be a better way.
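One possible way to avoid hardcoding 22 in the file is to fill in the default on the fly before the list reaches parallel. A minimal sketch, assuming the "ip [port]" layout above and a hypothetical helper named default_port (not a GNU Parallel feature):

```shell
# default_port: hypothetical helper, not a GNU Parallel feature.
# $1: path to a list with "ip [port]" per line.
# Prints "ip port" for every non-empty line, using 22 when no port is given.
default_port() {
    awk 'NF { print $1, (NF >= 2 ? $2 : 22) }' "$1"
}

# Possible use with the --colsep command from above (not run here):
# default_port iplist | env_parallel --colsep ' ' --env SSHPASS \
#     "sshpass -e ssh {1} -p {2} 'env;hostname'"
```

The file then only needs a port on the lines where it differs from the default, and the table structure expected by --colsep stays intact.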

Summary

  1. I want to use the designed options for their possible syntactic convenience, but I don't know their impact or interactions
  2. I want to debug the parallel Perl script to learn what the options do (especially --onall, which prompted another question: Does gnu parallel's --onall not work with sshpass?)

Feedback

GNU parallel works great in general, especially with --dry-run and --joblog to help debugging, but for cases like this not covered by documentation, adding a guide to debugging/contributing would empower users, considering that there are so many options that interact, and so many ways to specify a server, or take input.


Solution

  • An SSHLogin like:

    user@host
    

    will run ssh -l user host.

    user:@host
    

    will run sshpass -e ssh -l user host.

    :@host
    

    will run sshpass -e ssh host.

    user:mypass@host
    

    will run SSHPASS=mypass sshpass -e ssh -l user host.

    So the reason why you need the additional ':' after user is to tell GNU Parallel to use sshpass with $SSHPASS that is set before running GNU Parallel.

    You can add :<port> to host to set the port (as you expect).

    So with these in myslf:

    user:@host
    :@host:2222
    user:pass@host:22223
    

    you can set SSHPASS and run:

    parallel --slf myslf echo ::: {1..100}
    

    But it seems you have found a bug: user:pass@host does not work for --onall.

    This will be fixed in 20240522 https://savannah.gnu.org/bugs/index.php?65679
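Combining this with a list in the "host [port]" layout from the question, the SSHLogin file could be generated instead of written by hand. A minimal sketch, assuming a hypothetical helper make_slf, the same user on every host, and SSHPASS exported beforehand; the :<port> suffix follows the SSHLogin format described above, and "deploy" is a placeholder user name:

```shell
# make_slf: hypothetical helper, not part of GNU Parallel.
# $1: user name, $2: path to a list with "host [port]" per line.
# Prints one SSHLogin per line: user:@host or user:@host:port.
make_slf() {
    awk -v user="$1" \
        'NF { printf "%s:@%s%s\n", user, $1, (NF >= 2 ? ":" $2 : "") }' "$2"
}

# Possible use (not run here):
# make_slf deploy iplist > myslf
# parallel --slf myslf echo ::: {1..100}
```

Hosts without a port in the list get no :<port> suffix, so ssh falls back to its default port for them.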