So, this sounds like it "should" be possible. But it takes some explaining. Let me start from something simple. Let's say I'm on some vaguely POSIX-like system (maybe GNU/Linux, maybe Cygwin, whatever). I want a shell script that runs two commands and displays the output. Easy:
#!/bin/bash
grep needle 1st-haystack.txt
grep needle 2nd-haystack.txt
Okay, now it turns out that this takes too long... let's try to run them in parallel as background jobs:
#!/bin/bash
grep needle 1st-haystack.txt > tmp1.txt &
grep needle 2nd-haystack.txt > tmp2.txt &
wait
cat tmp1.txt tmp2.txt
rm tmp1.txt tmp2.txt
A bit more complicated, but not bad. But what if the command I want to run isn't grep? What if, hypothetically, it's some command that can usually run with no interaction but once in a while needs to prompt the user for input?
In that case, when it tries to read from stdin while in the background, the terminal driver will notice that it's trying to read from the controlling terminal and send it a signal (SIGTTIN) which will suspend it by default. The shell will notice this and wait will return, and then the script can try to do something sensible about it. For instance, something like this:
#!/bin/bash
# example below is suggestive and doesn't actually work else this would get too long
sleep 5 && cat > tmp1.txt & # cat is our stand-in for a process that might need input
sleep 8 && cat > tmp2.txt & # cat is our stand-in for a process that might need input
wait # wait could return because we're "done" or because one of them needed input
cat tmp1.txt # so we figure out which case applies, and if one of them is stuck, then we cat
# what it wrote so far, so that the user has enough context to supply the input it needs.
fg # then we bring it to the foreground, and don't try any more parallel tricks
# cleanup code and other stuff is omitted for brevity
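The "figure out which case applies" step could be sketched roughly like this. It is only an illustration: job_state is a made-up helper name, and it assumes a Linux-style /proc/PID/status file, which not every POSIX-like system provides.

```bash
#!/bin/bash
# job_state PID  ->  prints "running", "stopped" or "done".
# Hypothetical helper; assumes /proc/PID/status exists (Linux-style /proc).
job_state() {
    local pid=$1 key value
    # The status file is gone once the process has exited and been reaped.
    [ -r "/proc/$pid/status" ] || { echo done; return; }
    while read -r key value; do
        if [ "$key" = "State:" ]; then
            case $value in
                T*) echo stopped ;;   # stopped by SIGSTOP, SIGTSTP, SIGTTIN or SIGTTOU
                Z*) echo done ;;      # exited but not yet reaped
                *)  echo running ;;
            esac
            return
        fi
    done 2>/dev/null < "/proc/$pid/status"
    echo done                         # the process exited while we were looking
}
```

After wait returns, the script could call job_state on each saved $! value, cat the partial output of whichever job reports "stopped", and then fg it. Note that fg needs job control, which a non-interactive bash only has after set -m.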
So far so good! Now let's try this over SSH. First, the simple version:
#!/bin/bash
ssh box1 grep needle 1st-haystack.txt
ssh box2 grep needle 2nd-haystack.txt
This works of course. Now in parallel:
#!/bin/bash
ssh box1 grep needle 1st-haystack.txt > tmp1.txt &
ssh box2 grep needle 2nd-haystack.txt > tmp2.txt &
wait
cat tmp1.txt tmp2.txt
rm tmp1.txt tmp2.txt
Still no problem! Now let's bring back the complication of maybe needing input:
#!/bin/bash
# example below is suggestive and doesn't actually work else this would get too long
ssh box1 'sleep 5 && cat' > tmp1.txt &
ssh box2 'sleep 8 && cat' > tmp2.txt &
# uh-oh, now what?
It turns out that ssh has no way of knowing whether the remote process is trying to read input or not. Thus, ssh needs to be ready to accept local input at all times. But it can't call read() unconditionally, because then it would get suspended and the grep example above would not work. So instead it poll()s to see if input is available from the local terminal, and if so THEN it read()s it. When ssh is forced to run in the background, poll() never tells it that input is ready, so it never tries to read() it, so it never gets auto-suspended. Thus the shell never gets any hints that the background job is stuck and will never get unstuck.
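The "check first, read only if ready" pattern has a rough bash analogue, which may make the behaviour easier to picture. This is a sketch, not what ssh actually does internally: read_if_ready is a made-up name, and it relies on bash's read -t 0, which only tests whether input is available and consumes nothing.

```bash
#!/bin/bash
# read_if_ready: look before reading, in the spirit of poll() followed by read().
# Hypothetical helper; needs bash 4 or later for `read -t 0`.
read_if_ready() {
    local line
    if read -t 0; then                 # readiness test only, no data is consumed
        if IFS= read -r line; then     # input is pending, so this will not block
            printf 'got: %s\n' "$line"
        else
            echo 'eof'                 # "ready" can also mean end of file
        fi
    else
        echo 'idle'                    # nothing pending, so never call read at all
    fi
}
```

A process written this way never issues a blocking read on an idle terminal, so it never gives the terminal driver a reason to stop it.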
My question is, how much of the tech stack would one have to rewrite in order to get something like this to work? Naively, it seems that this requires surgery across a lot of levels.
If we had run ssh with -t, then cat would have a controlling (pseudo-)terminal. In that case, the kernel would know that sshd holds the master end of the pseudo-terminal. Since we did not use -t, the standard input for cat is likely an ordinary pipe. The kernel still knows that sshd holds the other end of that pipe.
sshd has no standardized way to report this info to the SSH client.
(And then we need to fix the weird issue where the terminal starts misbehaving if I try to start ssh in the background and then bring it to the foreground. That's likely just a mundane bug.)
(Oh, and also, at some point it would be nice to support non-POSIX ssh clients. For instance, there's a nice one called plink which is part of the PuTTY suite. Unlike PuTTY itself, it's a command-line tool with no GUI. But unlike OpenSSH, it doesn't work in terms of POSIX concepts like signals and terminals, since it's a native Windows app. But this is hard enough so let's ignore non-POSIX for now.)
My question is: is there an EASIER way to get this simple scenario to work, without the "rewrite the world" plan above?
(Continuing from comments of the first answer)
I don't think modifications at the kernel level are necessary here, nor in sshd. A custom remote script-driver will certainly be needed. Rolling a custom SSH client, while not strictly necessary (AFAICS now), might help discover exactly what degrees of control you have on the client side, and thereby dispel some confusion; how many bash processes are we talking about again? With stock ssh -t, we certainly have 2 PTY master/slave pairs per session:
                              │
          Local host          │          Remote host
                              │
term emulator   /dev/ptmx     ┌─►sshd───────────► /dev/ptmx
 │                            │  │
 └► shell─────► /dev/ptsN     │  └─►login shell─► /dev/ptsM
     │                ▲       │      │                   ▲
     └► ssh client────┘       │      └►script-driver─────┤
         │                    │        │                 │
         │   network tunnel   │        └►bash ───────────┤
         └────────────────────┘          │               │
                                         └►grep ─────────┘
With 2 ssh -t sessions, there are 3 master/slave PTY pairs, on 3 hosts.
See, the picture is getting complicated fast, and we haven't even got into the session and process-group shenanigans yet. That's why I'm saying a custom ssh client might give a clearer interface to the problem.
Now, consider that you can write your own script-driver in native code (I don't think bash will cut it), and it can take the place of the login shell on the remote side (i.e. ssh remote-host -- script-driver payload-script).
What the script-driver can do is this:
As the session leader for payload-script, the script-driver can arrange for SIGTTIN to be sent to payload-script whenever it reads stdin (by keeping payload-script in a background process group of the session's terminal), together with SIGCHLD sent to script-driver. This is where you catch the condition you want.
And since script-driver is presumably written by you, you're free to decide how to deal with it. With a custom ssh client acting as a multiplexer, there's the nice option to report the condition from script-driver up to the multiplexer, perhaps by messaging over a dedicated channel in the tunnel (to avoid interfering with the stdout/stderr streams). If the custom multiplexer client is a GUI, you then handle it there as you see fit.
Such a custom script-driver might very well also work with the stock ssh client from OpenSSH. You'll just need to invent unambiguous signalling, all the way from script-driver back to the ssh-caller script (the shell on the Local host side in the diagram), through the ssh client. And be really careful not to confuse which messages travel where.
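On the caller side, one possible shape for that "unambiguous signalling" is a line prefix that the script-driver puts in front of its own messages, which the local script then filters out. The sketch below is purely illustrative: the @@script-driver@@ marker, the split_control name and the use of fd 3 are all inventions for this example, not part of any existing tool.

```bash
#!/bin/bash
# Hypothetical marker that script-driver would prepend to its control messages.
SENTINEL='@@script-driver@@'

# split_control: read lines on stdin; payload lines go to stdout, control
# messages (with the marker stripped) go to fd 3, which the caller must open.
split_control() {
    local line
    while IFS= read -r line || [ -n "$line" ]; do    # also keep a final unterminated line
        if [[ $line == "$SENTINEL "* ]]; then
            printf '%s\n' "${line#"$SENTINEL "}" >&3
        else
            printf '%s\n' "$line"
        fi
    done
}
```

The caller would run the stream coming back from ssh through split_control with fd 3 redirected to wherever it wants the notifications, for example a file or FIFO per host. Without -t, stock ssh keeps the remote stdout and stderr separate, so the marked lines could travel on stderr and leave the payload's stdout alone.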
I see an even more advanced technique, the so-called "packet mode", TIOCPKT (referenced from ioctl_tty). There, again in native code only, as a PTY master, you can arrange to receive a TIOCPKT_FLUSHREAD message whenever the (pseudo-)terminal slave's read queue empties. This means you can place a sentinel character (a space, maybe, or a null char) in the slave's stdin beforehand, keeping the queue non-empty; then as soon as the slave does a read() from it, you'll get the TIOCPKT_FLUSHREAD message in packet mode.
But this feels more arcane than necessary. The classic SIGTTIN + SIGCHLD approach seems simpler.