If you readline() from sys.stdin, passing the rest of it to a subprocess does not seem to work.
import subprocess
import sys
header = sys.stdin.buffer.readline()
print(header)
subprocess.run(['nl'], check=True)
(I'm using sys.stdin.buffer to avoid any encoding issues; this handle returns the raw bytes.)
This runs, but I don't get any output from the subprocess:
bash$ printf '%s\n' foo bar baz | python demo1.py
b'foo\n'
If I take out the readline etc., the subprocess reads standard input and produces the output I expect:
bash$ printf '%s\n' foo bar baz |
> python -c 'import subprocess; subprocess.run(["nl"], check=True)'
1 foo
2 bar
3 baz
Is Python buffering the rest of stdin when I start reading it, or what's going on here? Running with python -u does not remove the problem (and indeed, the documentation for it only mentions that it changes the behavior for stdout and stderr). But if I pass in a larger amount of data, I do get some of it:
bash$ wc -l /etc/services
13921 /etc/services
bash$ python demo1.py </etc/services | head -n 3
1 27/tcp # NSW User System FE
2 # Robert Thomas <BThomas@F.BBN.COM>
3 # 28/tcp Unassigned
(... traceback from broken pipe elided ...)
bash$ fgrep -n 'NSW User System FE' /etc/services
91:nsw-fe 27/udp # NSW User System FE
92:nsw-fe 27/tcp # NSW User System FE
bash$ sed -n '1,/NSW User System FE/p' /etc/services | wc
91 449 4082
(So, looks like it eats 4096 bytes from the beginning.)
Is there a way I can avoid this behavior, though? I would like to only read one line off from the beginning, and pass the rest to the subprocess.
Calling sys.stdin.buffer.readline(-1) repeatedly in a loop does not help.
This is actually a follow-up to "Read line from shell pipe, pass to exec, and keep to variable", but I wanted to focus on the one aspect of that question that surprised me.
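This is because sys.stdin is created using the built-in open function in the default buffered mode, which uses a buffer of size io.DEFAULT_BUFFER_SIZE, which on most systems is either 4096 or 8192 bytes. The first readline() therefore pulls up to a whole buffer's worth of data out of the pipe, and the subprocess, which inherits the same file descriptor, only sees whatever is left after that.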
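The effect can be reproduced without a shell pipeline. The following is only a minimal sketch: it uses os.pipe() as a stand-in for the real standard input, and leftover_after_readline is a hypothetical helper name, not something from the question. It reads one line through a Python file object, then reads straight from the file descriptor to see what a child process would still find there.

```python
import os

def leftover_after_readline(buffering):
    """Return (first_line, bytes still in the pipe after one readline())."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"foo\nbar\nbaz\n")
    os.close(write_fd)  # so that reads hit EOF instead of blocking
    with os.fdopen(read_fd, "rb", buffering=buffering) as reader:
        first = reader.readline()
        # Bypass the Python object and read from the descriptor itself,
        # which is what a child inheriting the descriptor would see.
        rest = os.read(read_fd, 1024)
    return first, rest

print(leftover_after_readline(-1))  # default buffering: (b'foo\n', b'')
print(leftover_after_readline(0))   # unbuffered: (b'foo\n', b'bar\nbaz\n')
```

With the default buffering, the input is small enough to fit in the buffer, so nothing is left on the descriptor; with buffering=0 only the first line has been consumed.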
To make the parent process consume precisely one line of text from the standard input, you can therefore open it with the buffer disabled by passing 0 as the buffering argument to the open or os.fdopen function:
# subp1.py
import os
import sys
import subprocess
# or with the platform-dependent device file:
# unbuffered_stdin = open('/dev/stdin', 'rb', buffering=0)
unbuffered_stdin = os.fdopen(sys.stdin.fileno(), 'rb', buffering=0)
print(unbuffered_stdin.readline())
subprocess.run(['nl'], check=True)
so that:
printf "foo\nbar\n" | python subp1.py
would then output:
b'foo\n'
1 bar
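If keeping the default buffered sys.stdin.buffer is preferable, one possible alternative (not part of the approach above, and only a sketch) is to let Python read the rest of the input and hand it to the child through the input argument of subprocess.run. The helper name run_with_rest is hypothetical, and the demo uses an io.BytesIO object plus a small Python child that copies its input, as stand-ins for the real standard input and for nl.

```python
import io
import subprocess
import sys

def run_with_rest(stream, argv):
    """Read one line from stream, then feed everything after it to argv."""
    header = stream.readline()
    rest = stream.read()  # includes whatever Python had already buffered
    result = subprocess.run(argv, input=rest, stdout=subprocess.PIPE, check=True)
    return header, result.stdout

# Stand-in child that copies stdin to stdout; on a Unix system this could
# be ['nl'] instead, with sys.stdin.buffer as the stream.
copy_child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
header, output = run_with_rest(io.BytesIO(b"foo\nbar\nbaz\n"), copy_child)
print(header)  # b'foo\n'
print(output)  # b'bar\nbaz\n'
```

The trade-off is that the remaining input is held in memory in the parent before the child sees any of it, so for large or unbounded streams the unbuffered approach is likely the better fit.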