In my Perl script, I would like to process lines from either STDIN
or a given file, if specified, as common with Linux/UNIX command line utilities.
To this end, I have the following section in my script (simplified for the post):
use strict;
use warnings;
my $in = \*STDIN;
open $in, '<', $ARGV[0] or die if (defined $ARGV[0]);
print while (<$in>);
Essentially, I define $in to be a reference to the STDIN typeglob, so normally, if no argument is specified, the script does print for each line of <STDIN>. So far, so good.
If $ARGV[0] is defined, however, I would like to read lines from that file instead. That is what the second meaningful line purports to do. However, it seems that no lines are processed when the script is run with an argument.
I noticed that after my conditional call to open, $in does not change, even when I expect it to:
my $in = \*STDIN;
print $in, "\n";
open $in, '<', $ARGV[0] or die if (defined $ARGV[0]);
print $in, "\n";
yields
GLOB(0xaa08b2f4f28)
GLOB(0xaa08b2f4f28)
even when $ARGV[0] is defined. Does open not work when the first variable passed is already referring to a filehandle?
The relevant documentation does include the following:
About filehandles
The first argument to open, labeled FILEHANDLE in this reference, is usually a scalar variable. (Exceptions exist, described in "Other considerations", below.) If the call to open succeeds, then the expression provided as FILEHANDLE will get assigned an open filehandle. That filehandle provides an internal reference to the specified external file, conveniently stored in a Perl variable, and ready for I/O operations such as reading and writing.
Based on this alone, I do not see why my code would not work.
That's precisely what the null filehandle <> does:
Input from <> comes either from standard input, or from each file listed on the command line.
So all you need is
while (<>) {
...
}
(see the rest of what docs say about it)
Another option, safer in some cases, is to use the double diamond operator
while (<<>>) { }
Using double angle brackets inside of a while causes the open to use the three argument form (with the second argument being <), so all arguments in ARGV are treated as literal filenames (including "-"). (Note that for convenience, if you use <<>> and if @ARGV is empty, it will still read from the standard input.)
(again, please see the rest of what docs say)
For the second part of the question, and following a discussion in comments, it is worth noting that my $in = \*STDIN creates an alias to STDIN (not a copy); see this post. Opening a file with such a scalar (one that had previously been assigned a reference to a typeglob) as the filehandle merely redirects the original typeglob. So here, once we open the $in filehandle, STDIN winds up connected to that file.
This is easily checked
perl -wE'
$in = \*STDIN;
say "\$in: $$in"; #--> *main::STDIN
print while <$in>; # type input, then Ctrl-D
open $in, "<", $ARGV[0] or die $!;
say "\$in is: $$in"; #--> *main::STDIN
print while <$in>; # but prints the file
seek $in, 0, 0;
print while <STDIN>; # prints the file
' file
After we type in some input, which is printed back, and press Ctrl-D, the file is opened. The filehandle is shown to still be STDIN, yet reading from it prints that file. Reading from STDIN then prints the file as well.
STDIN has been reconnected by open to the file; getting it back isn't simple. So if one really wants to associate STDIN with a lexical, it is better to dup it. See docs and the linked post.
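As a minimal sketch of the dup approach mentioned above: the '<&' mode of open duplicates the descriptor into a fresh lexical handle, so re-opening that lexical later should leave STDIN alone. The temporary file here is only a stand-in for $ARGV[0], and the variable names are illustrative, not from the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A throwaway file to re-open the handle on (stands in for $ARGV[0]).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "from file\n";
close $tmp_fh or die "Can't close $tmp_name: $!";

# Dup STDIN: $in gets its own glob and its own file descriptor.
open my $in, '<&', \*STDIN or die "Can't dup STDIN: $!";
my $dup_fd = fileno $in;

# Re-opening the lexical affects only the dup, not STDIN.
open $in, '<', $tmp_name or die "Can't open $tmp_name: $!";
my $line = <$in>;

print "line read via \$in: $line";
print "STDIN is on descriptor ", fileno(STDIN), "; the dup was on $dup_fd\n";
```

Since the dup and STDIN are separate handles, the two descriptor numbers printed at the end should differ.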
As for the direct question: yes, one can reassign a filehandle by open-ing it.
But the ... or die if ... syntax is wrong as one cannot chain conditionals like that.
However, I cannot reproduce the shown behavior, as your code actually works for me (on 5.16 and 5.30 on Linux). My best guess, then, is that such code results in "undefined behavior", so we get unpredictable and inconsistent results.
Consider
E1 or E2 if E3;
where the Es stand for expressions. (This stands for open(...) or die($!) if COND;)
What should if E3 apply to: the lone E2 or the whole E1 or E2? There seems to be no way to tell, and what one may well get then is the dreaded "undefined behavior" (UB): it may actually work sometimes, under some conditions, or on some systems, or anything else may happen.
Now, there may be a little more to it: E2 if E3 cannot be a part of a condition, so the interpretation of it all as E1 or (E2 if E3); is directly illegal syntax. So perhaps in my program the statement is interpreted as
(E1 or E2) if E3;
which is fine (and works as intended, as it happens). However, the original statement still must be UB and on OP's system it doesn't work.
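One way to see how a given perl actually groups such a statement is to have the core B::Deparse module print it back with explicit parentheses. This is a hedged sketch rather than a definitive test: the statement is wrapped in an anonymous sub so that it is compiled but never run, and the names $stmt and $text are made up for illustration. For reference, perlsyn describes a statement modifier as applying to the simple statement that precedes it.

```perl
use strict;
use warnings;
use B::Deparse;

# The statement from the question, wrapped in a sub so it is compiled but not run.
my $stmt = sub {
    my $in = \*STDIN;
    open $in, '<', $ARGV[0] or die if defined $ARGV[0];
};

# -p asks B::Deparse to add parentheses that make the grouping explicit.
my $text = B::Deparse->new('-p')->coderef2text($stmt);
print $text, "\n";
```

Running this on the system where the problem shows up would indicate which of the two groupings discussed above that perl chose.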
Thus, if you do need a filehandle, at a minimum you can fix that by adding parentheses
(open $in, '<', $ARGV[0] or die $!) if defined $ARGV[0];
But I'd recommend writing a nice and readable test instead of cramming it all into one statement (and dup-ing STDIN to start with).
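A readable version along those lines might look like the following sketch. The helper name open_input is hypothetical (it is not from the question or from any module); it opens the named file when a path is given and otherwise returns a dup of STDIN, so STDIN itself is never re-opened.

```perl
use strict;
use warnings;

# open_input is a hypothetical helper name used only for this sketch.
# Returns a read handle on $path if given, else a duplicate of STDIN.
sub open_input {
    my ($path) = @_;
    my $in;
    if (defined $path) {
        open $in, '<', $path or die "Can't open '$path': $!";
    }
    else {
        open $in, '<&', \*STDIN or die "Can't dup STDIN: $!";
    }
    return $in;
}

# Usage in a script:
#   my $in = open_input($ARGV[0]);
#   print while <$in>;
```

Keeping the two cases in separate branches gives each open its own error message and avoids any question about what a trailing if applies to.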