Advanced Perl Programming

Chapter 3. Typeglobs and Symbol Tables

3.4 Filehandles, Directory Handles, and Formats

The built-in functions open and opendir initialize a filehandle and a directory handle, respectively:

open(F, "/home/calvin");
opendir (D, "/usr");

The symbols F and D are user-defined identifiers with no prefix symbol. Unfortunately, handles lack some basic facilities that scalars, arrays, and hashes enjoy. You cannot assign one handle to another, and you cannot create local handles:[4]

[4] I don't know why filehandles didn't get a standard prefix symbol and the other features enjoyed by the other data types.

local (G);   # invalid 
G = F;       # also invalid

Before we go further, it is important to know that the standard Perl distribution comes with a module called FileHandle that provides an object-oriented version of filehandles. This allows you to create filehandle "objects," to assign one to the other, and to create them local to the block. Similarly, directory handles are handled by DirHandle. Developers are now encouraged to use these facilities instead of the techniques described next. But you still need to wade through the next discussion because there is a large amount of freeware code in which you will see these constructs; in fact, the standard modules FileHandle, DirHandle, and Symbol, as well as the entire IO hierarchy of modules, are built on this foundation.

Why is it so important to be able to assign handles and create local filehandles? Without assignment, you cannot pass filehandles as parameters to subroutines or maintain them in data structures. Without local filehandles, you cannot create recursive subroutines that open files (for processing included files, which themselves might include more, for example).

The simple solution to this problem is to use typeglob assignment. That is, if you feel the urge to say,

G = F;
# or,
local(F);

you can write it instead in terms of typeglobs:

*G = *F;
# or, 
local (*F);

Similarly, if you want to store filehandles in data structures or create references to them, you use the corresponding typeglob. All I/O operators that require filehandles also accept typeglob references. Let us take a look at what we can do with assigning filehandles and localizing them (using typeglobs, of course).
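As a minimal sketch of the point above, the following stores a typeglob reference in a hash and in a plain scalar, and then hands each to an I/O operator. The scratch file path and the names %handles, OUT, and IN are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical scratch file, used only for this illustration.
my $path = "/tmp/typeglob_demo_$$.txt";

open(OUT, '>', $path) || die "cannot write $path: $!";
my %handles = (log => \*OUT);             # typeglob reference kept in a hash
print { $handles{log} } "first line\n";   # block form lets print take an expression
close($handles{log});

open(IN, '<', $path) || die "cannot read $path: $!";
my $ref  = \*IN;                          # scalar holding a typeglob reference
my $line = <$ref>;                        # the <> operator accepts the reference
close($ref);
unlink $path;
```

The braces around $handles{log} are needed because print accepts only a bareword, a simple scalar variable, or a block in the filehandle position.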

3.4.1 I/O Redirection

The following example shows how I/O can be simply redirected:

open(F, '>/tmp/x') || die;
*STDOUT = *F;
print "hello world\n";

The print function thinks it is sending the output to STDOUT but ends up sending it to the open file instead, because the typeglob associated with STDOUT has been aliased to F. If you want this redirection to be temporary, you can localize *STDOUT.
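A temporary redirection might look like the sketch below, assuming a hypothetical subroutine quiet_hello and a scratch file path chosen for the example. The alias is undone automatically when the enclosing block exits.

```perl
use strict;
use warnings;

# Hypothetical scratch file, used only for this illustration.
my $path = "/tmp/redirect_demo_$$.txt";

sub quiet_hello {
    open(F, '>', $path) || die "cannot write $path: $!";
    local *STDOUT = *F;        # alias lasts only until this subroutine returns
    print "hello world\n";     # written to $path, not to the terminal
    close(F);                  # flush and close the file
}

quiet_hello();
print "back on the original STDOUT\n";   # the saved typeglob has been restored
```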

3.4.2 Passing Filehandles to Subroutines

The following piece of code passes a filehandle to a subroutine:

open (F, "/tmp/sesame") || die $!;
read_and_print(*F);
sub read_and_print {
    local (*G) = @_;  # Filehandle G is the same as filehandle F
    while (<G>) { print; }
}

You might wonder why you don't need to do the same with open; after all it is a subroutine too and takes a filehandle as a parameter. Well, for built-in functions such as open, read, write, and readdir, Perl automatically passes the typeglob of that symbol (instead of a string called "F", for example).
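Because the I/O operators accept a typeglob or a typeglob reference held in an ordinary scalar, a subroutine can also receive the handle in a my variable instead of aliasing it with local(*G). The sketch below assumes a hypothetical helper named count_lines and a scratch file created just for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: counts the lines available on the handle it is given.
sub count_lines {
    my ($fh) = @_;             # holds a typeglob or a reference to one
    my $n = 0;
    while (defined(my $line = <$fh>)) {
        $n++;
    }
    return $n;
}

# Hypothetical scratch file with three lines.
my $path = "/tmp/pass_demo_$$.txt";
open(W, '>', $path) || die "cannot write $path: $!";
print W "a\nb\nc\n";
close(W);

open(F, '<', $path) || die "cannot read $path: $!";
my $by_glob = count_lines(*F);     # pass the typeglob itself
close(F);

open(F, '<', $path) || die "cannot read $path: $!";
my $by_ref = count_lines(\*F);     # pass a reference to the typeglob
close(F);
unlink $path;
```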

3.4.3 Localizing Filehandles

Let us look at a subroutine that traverses include declarations in C header files. The subroutine shown next, ProcessFile, looks at each line of a file and, if it matches a #include declaration, extracts the filename and calls itself recursively. Since it has more lines to process in the original file, it cannot close the filehandle F. If F is global, it cannot be reused to open another file, so we use local(*F) to localize it. That way, each recursive invocation of ProcessFile gets its own unique filehandle value.

sub ProcessFile {
    my ($filename) = @_;
    my ($line);
    local (*F);           # Save the old value of the typeglob (and with it
                          # the old filehandle, among other things)
    open (F, $filename) || return; 
    while ($line = <F>) {
      # same as before
      ........
    }
    close(F);
}

Although we have not studied packages yet, it is worth seeing how we could have used the FileHandle module in this case:

use FileHandle;
sub ProcessFile {
    my ($filename) = @_;
    my ($line);
    my $fh = new FileHandle; # Create local filehandle
    open ($fh, $filename) || return; 
    while ($line = <$fh>) {
      ........
    }
    close($fh);
}
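To make the recursion concrete, here is one possible self-contained sketch. It does not reproduce the elided loop body above. The regular expression, the @seen list, and the subroutine name process_file are assumptions made for illustration, and the sketch does not guard against circular includes.

```perl
use strict;
use warnings;
use FileHandle;

# Hypothetical record of every file visited, in the order it was opened.
my @seen;

sub process_file {
    my ($filename) = @_;
    # Each invocation gets its own handle object, so an outer call's
    # handle stays open and untouched while inner calls run.
    my $fh = FileHandle->new($filename, "r") or return;
    push @seen, $filename;
    while (defined(my $line = <$fh>)) {
        # Assumed pattern: only the quoted form  #include "name"  is followed.
        if ($line =~ /^\s*#\s*include\s+"([^"]+)"/) {
            process_file($1);
        }
    }
    $fh->close;
}
```

Calling process_file on a header that includes a second header leaves both names in @seen, outer file first.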

3.4.4 Strings as Handles

It so happens that typeglobs and objects of the FileHandle module are not the only solution. All Perl I/O functions that accept a handle also happen to accept a string instead. Consider

$fh = "foo";
open ($fh, "< /home/snoopy") ;
read ($fh, $buf, 1000);

When open examines its parameters, it finds a string where a typeglob should have been. In this case, it automatically creates a typeglob of that name and then proceeds as before. Similarly, when read gets a string instead of a typeglob, it looks up the corresponding typeglob from the symbol table, and then the internal filehandle, and proceeds to read the appropriate file. This extra lookup is slightly slower than using a bareword symbol, but the time taken is insignificant if you do the I/O in reasonably large chunks (the optimal size varies from system to system).
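A string used as a handle is a symbolic reference, so this technique does not work under use strict 'refs'. The sketch below therefore disables that stricture explicitly. The handle name DEMO and the scratch file path are assumptions made for illustration.

```perl
use warnings;
no strict 'refs';    # string handles are symbolic references

# Hypothetical scratch file and handle name.
my $path = "/tmp/string_handle_demo_$$.txt";
my $fh   = "DEMO";

open($fh, '>', $path) || die "cannot write $path: $!";
print $fh "snoopy\n";              # the string is looked up as a typeglob name
close($fh);

open($fh, '<', $path) || die "cannot read $path: $!";
my $buf;
my $got = read($fh, $buf, 1000);   # returns the number of bytes actually read
close($fh);
unlink $path;
```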



Copyright © 2001 O'Reilly & Associates. All rights reserved.