open FILEHANDLE, EXPR
open FILEHANDLE
This function opens the file whose filename is given by EXPR, and
associates it with FILEHANDLE. If EXPR is omitted, the
scalar variable of the same name as the FILEHANDLE must contain the
filename. (And you must also be careful to use "or die" after the
statement rather than "|| die", because the precedence of ||
is higher than list operators like open.)
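The precedence point can be seen in a short sketch. The path below is hypothetical and is assumed not to exist, the handle names BAD and GOOD are ours, and eval is used only so that the script can report what happened instead of exiting.

```perl
my $missing = "/no/such/dir/no-such-file";   # hypothetical, assumed absent

# "||" binds to the filename: open(BAD, ($missing || die ...)).
# $missing is a true string, so die is never evaluated.
my $survived = eval {
    open BAD, $missing || die "open failed: $!\n";
    1;
};
print "with ||: the failure went unnoticed\n" if $survived;

# "or" applies to the result of the whole open call.
my $caught = !eval {
    open GOOD, $missing or die "open failed: $!\n";
    1;
};
print "with or: the failure was caught\n" if $caught;
```

Writing open(GOOD, $missing) || die ... with explicit parentheses would behave like the "or" form.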
FILEHANDLE may be a directly specified filehandle name, or an
expression whose value will be used for the filehandle. The latter is
called an indirect filehandle. If you supply an undefined variable for the indirect
filehandle, Perl will not automatically fill it in for you - you
have to make sure the expression returns something unique, either
a string specifying the actual filehandle name, or a filehandle
object from one of the object-oriented I/O packages. (A filehandle
object is unique because you call a constructor to generate the object.
See the example later in this section.)
After the filehandle is determined, the filename string is processed. First,
any leading and trailing whitespace is removed from the string.
Then the string is examined on both ends for characters specifying how
the file is to be opened. (By an amazing coincidence, these characters
look just like the characters you'd use to indicate I/O redirection to the
Bourne shell.) If the filename begins with < or nothing, the file
is opened for input. If the filename begins with >, the file
is truncated and opened for output. If the filename begins with >>, the
file is opened for appending.
(You can also put a + in front of the > or < to
indicate that you want both read and write access to the file.)
If the filename begins with |, the filename is interpreted as
a command to which output is to be piped, and if the filename ends
with a |, the filename is interpreted as a command that pipes
input to us.
You may not have an open command that pipes both
in and out, although
the IPC::Open2 and IPC::Open3 library routines give you a close
equivalent. See the section "Bidirectional Communication" in Chapter 6.
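As a rough illustration of the mode characters described above, the following sketch creates, appends to, updates, and then reads a scratch file. The path under /tmp and the handle names are assumptions made for the example.

```perl
my $path = "/tmp/open_modes_demo.$$";     # hypothetical scratch file

open OUT, ">$path" or die "Can't create $path: $!\n";     # truncate, write
print OUT "first\n";
close OUT;

open OUT, ">>$path" or die "Can't append to $path: $!\n"; # append
print OUT "second\n";
close OUT;

open RW, "+<$path" or die "Can't update $path: $!\n";     # read and write
my $first = <RW>;            # reading works...
seek RW, 0, 2;               # ...and after seeking to the end of the file,
print RW "third\n";          # so does writing
close RW;

open IN, "<$path" or die "Can't read $path: $!\n";        # read only
my @lines = <IN>;
close IN;

unlink $path;
print @lines;
```

Note that +< leaves the existing contents alone, whereas +> would truncate the file before giving you read and write access.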
Any pipe command containing shell metacharacters is passed to
/bin/sh for execution; otherwise it is executed directly by
Perl. The filename "-" refers to STDIN, and ">-" refers to STDOUT.
open returns non-zero upon success, the undefined value otherwise.
If the open involved a pipe, the return value happens to be the
process ID of the subprocess.
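Here is a small sketch of the return value for a piped open. It assumes a UNIX-like system with an echo program on the search path; the handle name ECHO is ours.

```perl
# "echo hello" contains no shell metacharacters, so no shell is involved.
my $pid = open(ECHO, "echo hello |") or die "Can't run echo: $!\n";

my $line = <ECHO>;          # read what the subprocess wrote
close ECHO;

print "subprocess $pid wrote: $line";
```

The parentheses around the arguments are optional here, since "or" has lower precedence than the assignment as well as the list operator.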
If you're unfortunate enough to be running Perl on a system that
distinguishes between text files and binary files (modern operating
systems don't care), then you should check out binmode for tips
for dealing with this. The key distinction between systems that need
binmode and those that don't is their text file formats.
Systems like UNIX and Plan9 that delimit lines with a single
character, and that encode that character in C as '\n', do
not need binmode. The rest need it.
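A minimal sketch of binmode in use follows. The scratch path is hypothetical, and the byte count in the comment assumes that "\n" is a single character inside Perl, as it is on the systems we know of.

```perl
my $path = "/tmp/binmode_demo.$$";      # hypothetical scratch file

open RAW, ">$path" or die "Can't create $path: $!\n";
binmode RAW;                 # ask for no newline translation on this handle
print RAW "a\nb";
close RAW;

my $size = -s $path;         # 3 bytes: "a", the newline, and "b"
unlink $path;
print "wrote $size bytes\n";
```

On a system that doesn't need binmode the call is harmless, so it's reasonable to leave it in portable code that handles binary data.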
Here is some code that shows the relatedness of a filehandle and a variable of the same name:
    $ARTICLE = "/usr/spool/news/comp/lang/perl/misc/38245";
    open ARTICLE or die "Can't find article $ARTICLE: $!\n";
    while (<ARTICLE>) {...
Append to a file like this:
    open LOG, '>>/usr/spool/news/twitlog';    # (`log' is reserved)
Pipe your data from a process:
    open ARTICLE, "caesar <$article |";       # decrypt article with rot13
Here < does not indicate that Perl should open the file for input,
because < is not the first character of EXPR. Rather, the concluding
| indicates that input is to be piped from caesar <$article (from
the program caesar, which takes $article as its standard
input).
The < is interpreted by the subshell that Perl uses to start
the pipe, because < is a shell metacharacter.
Or pipe your data to a process:
    open EXTRACT, "|sort >/tmp/Tmp$$";        # $$ is our process number
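To see an output pipe from start to finish, here is a sketch that feeds a few lines to sort and reads the result back. It assumes a UNIX-like system with sort and /bin/sh available; the output path and handle names are hypothetical.

```perl
my $out = "/tmp/sorted_demo.$$";        # hypothetical output file

# The ">" is a shell metacharacter, so this command goes through /bin/sh.
open SORT, "|sort >$out" or die "Can't start sort: $!\n";
print SORT "pear\n", "apple\n", "fig\n";
close SORT or die "sort failed, status $?\n";   # waits for sort to finish

open IN, $out or die "Can't read $out: $!\n";
my @sorted = <IN>;
close IN;

unlink $out;
print @sorted;
```

Because the close waits for the subprocess, the output file is complete by the time we open it for reading.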
In this next example we show one way to do recursive opens, via
indirect filehandles. The files will be opened on filehandles
fh01, fh02, fh03, and so on. Because $input is
a local variable, it is preserved through recursion, allowing us to
close the correct file before we return.
    # Process argument list of files along with any includes.

    foreach $file (@ARGV) {
        process($file, 'fh00');
    }

    sub process {
        local($filename, $input) = @_;
        $input++;               # this is a string increment
        unless (open $input, $filename) {
            print STDERR "Can't open $filename: $!\n";
            return;
        }

        while (<$input>) {      # note the use of indirection
            if (/^#include "(.*)"/) {
                process($1, $input);
                next;
            }
            ...                 # whatever
        }
        close $input;
    }
You may also, in the Bourne shell tradition, specify an EXPR beginning
with >&, in which case the rest of the string is interpreted
as the name of a filehandle (or file descriptor, if numeric) which is
to be duped and opened.[6]
You may use & after >, >>, <, +>, +>>, and +<. The mode you specify
should match the mode of the original filehandle. Here is a script
that saves, redirects, and restores STDOUT and STDERR:

    #!/usr/bin/perl
    open SAVEOUT, ">&STDOUT";
    open SAVEERR, ">&STDERR";

    open STDOUT, ">foo.out" or die "Can't redirect stdout";
    open STDERR, ">&STDOUT" or die "Can't dup stdout";

    select STDERR; $| = 1;         # make unbuffered
    select STDOUT; $| = 1;         # make unbuffered

    print STDOUT "stdout 1\n";     # this propagates to
    print STDERR "stderr 1\n";     # subprocesses too

    close STDOUT;
    close STDERR;

    open STDOUT, ">&SAVEOUT";
    open STDERR, ">&SAVEERR";

    print STDOUT "stdout 2\n";
    print STDERR "stderr 2\n";

[6] The word "dup" is UNIX-speak for "duplicate". We're not really trying to dupe you. Trust us.
If you specify <&=N, where N is a number, then Perl will do an equivalent of C's
fdopen(3) of that file descriptor; this is more
parsimonious with file descriptors than the dup form described earlier. (On the
other hand, it's more dangerous, since two filehandles may now be sharing the
same file descriptor, and a close on one filehandle may prematurely close the
other.) For example:
    open FILEHANDLE, "<&=$fd";
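The difference between the two forms shows up in the descriptor numbers. The sketch below uses the output variants (>&= and >&) on STDOUT; the handle names ALIAS and COPY are ours.

```perl
$| = 1;                      # keep STDOUT unbuffered for this demonstration
my $fd = fileno(STDOUT);

open ALIAS, ">&=$fd"   or die "Can't fdopen descriptor $fd: $!\n";
open COPY,  ">&STDOUT" or die "Can't dup STDOUT: $!\n";

my $alias_fd = fileno(ALIAS);    # the same descriptor as STDOUT
my $copy_fd  = fileno(COPY);     # a newly allocated descriptor

print "STDOUT=$fd ALIAS=$alias_fd COPY=$copy_fd\n";

close COPY;                  # safe: COPY has a descriptor of its own
# Closing ALIAS here would close the descriptor out from under STDOUT.
```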
If you open a pipe to or from the command "-" (that is, either
|- or -|), then an implicit fork is done,
and the return value of open is the pid of the
child within the parent process, and 0 within the child
process. (Use defined($pid) in either the parent or child to
determine whether the open was successful.) The
filehandle behaves normally for the parent, but input and output to that
filehandle is piped from or to the STDOUT or STDIN of the child
process. In the child process the filehandle isn't opened: I/O happens
from or to the new STDIN or STDOUT. Typically this is used
like the normal piped open when you want to
exercise more control over just how the pipe command gets executed, such as when
you are running setuid, and don't want to have to scan shell commands for
metacharacters. The following pairs are equivalent:
    open FOO, "|tr '[a-z]' '[A-Z]'";
    open FOO, "|-" or exec 'tr', '[a-z]', '[A-Z]';

    open FOO, "cat -n file|";
    open FOO, "-|" or exec 'cat', '-n', 'file';
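Spelled out with an explicit test of the return value, the forking form might look like the sketch below. It assumes a UNIX-like system with an echo program; the handle name KID and the arguments are only illustrative. Since exec is given a list, no shell ever sees the semicolon or the asterisk.

```perl
my $pid = open(KID, "-|");               # implicit fork
die "Can't fork: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child: our STDOUT is now the pipe that the parent reads from.
    exec 'echo', 'a;b', '*' or die "Can't exec echo: $!\n";
}

# Parent: read what the child wrote.
my $line = <KID>;
close KID;
print "child $pid said: $line";
```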
Explicitly closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?. On any operation which may do a fork, unflushed buffers remain unflushed in both processes, which means you may need to set $| on one or more filehandles to avoid duplicate output (and then do output to flush them).
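Here is a sketch of picking up the status after closing a piped filehandle. It assumes a UNIX-like system with sh on the search path, and relies on the usual wait(2) convention that the exit value lives in the high byte of $?.

```perl
open(KID, "sh -c 'exit 3' |") or die "Can't start sh: $!\n";
my @output = <KID>;          # nothing is written, so this is empty

my $ok   = close KID;        # waits for the subprocess; false if it failed
my $exit = $? >> 8;          # exit value from the wait status

print "close reported failure, exit value $exit\n" unless $ok;
```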
Filehandles STDIN, STDOUT, and STDERR remain open
following an exec. Other filehandles do not. (However, on systems
supporting the fcntl function, you may modify the
close-on-exec flag for a filehandle. See fcntl earlier in
this chapter. See also the special $^F variable.)
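For systems that do support fcntl, clearing the close-on-exec flag might look like this sketch. It assumes the Fcntl module supplies F_GETFD, F_SETFD, and FD_CLOEXEC on your system, and that /dev/null exists; the handle name KEEP is ours.

```perl
use Fcntl;                   # F_GETFD, F_SETFD, FD_CLOEXEC

open KEEP, "</dev/null" or die "Can't open /dev/null: $!\n";

# fcntl returns "0 but true" for a zero result, so "or die" is safe here.
my $flags = fcntl(KEEP, F_GETFD, 0) or die "Can't get flags: $!\n";
fcntl(KEEP, F_SETFD, $flags & ~FD_CLOEXEC) or die "Can't set flags: $!\n";

my $after   = fcntl(KEEP, F_GETFD, 0) or die "Can't get flags: $!\n";
my $cleared = !($after & FD_CLOEXEC);    # true once the flag is off
print "KEEP will now survive an exec\n" if $cleared;
```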
Using the constructor from the FileHandle module, described in Chapter 7, you can generate anonymous filehandles which have the scope of whatever variables hold references to them, and automatically close whenever and however you leave that scope:
    use FileHandle;
    ...
    sub read_myfile_munged {
        my $ALL = shift;
        my $handle = new FileHandle;
        open $handle, "myfile" or die "myfile: $!";
        $first = <$handle> or return ();      # Automatically closed here.
        mung $first or die "mung failed";     # Or here.
        return $first, <$handle> if $ALL;     # Or here.
        $first;                               # Or here.
    }
In order to open a file with arbitrary weird characters in it, it's necessary to protect any leading and trailing whitespace, like this:
    $file =~ s#^(\s)#./$1#;
    open (FOO, "< $file\0");
But we've never actually seen anyone use that in a script...
If you want a real C open(2), then you should use the sysopen function. This is another way to protect your filenames from interpretation. For example:
    use FileHandle;
    sysopen HANDLE, $path, O_RDWR|O_CREAT|O_EXCL, 0700
        or die "sysopen $path: $!";
    HANDLE->autoflush(1);
    HANDLE->print("stuff $$\n");
    seek HANDLE, 0, 0;
    print "File contains: ", <HANDLE>;
See seek for some details about mixing reading and writing.