Learning Perl

Chapter 12
Directory Access

12.2 Globbing

The shell (or whatever your command-line interpreter is) takes a solitary asterisk (*) command-line argument and turns it into a list of all of the filenames in the current directory (apart from names that begin with a dot, which most shells skip by default). So, when you say rm *, you'll remove all of those files from the current directory. (Don't try this unless you like irritating your system administrator when you request the files to be restored.) Similarly, [a-m]*.c as a command-line argument turns into a list of all filenames in the current directory that begin with a letter in the first half of the alphabet and end in .c, and /etc/host* is a list of all filenames that begin with host in the directory /etc. (If this is new to you, you probably want to read some more about shell scripting before proceeding.)

The expansion of arguments like * or /etc/host* into the list of matching filenames is called globbing. Perl supports globbing through a very simple mechanism: just put the globbing pattern between angle brackets or use the more mnemonically named glob function.

@a = </etc/host*>;
@a = glob("/etc/host*");

In a list context, as demonstrated here, the glob returns a list of all names that match the pattern (as if the shell had expanded the glob arguments) or an empty list if none match. In a scalar context, the next name that matches is returned, or undef is returned if there are no more matches; this is very similar to reading from a filehandle. For example, to look at one name at a time:

while (defined($nextname = </etc/host*>)) {
    print "one of the files is $nextname\n";
}
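
To see the two contexts side by side, here is a minimal sketch. It builds a scratch directory with File::Temp so that it does not depend on what happens to be in /etc; the filenames (hosts, host.conf, passwd) and the variable names are made up for the illustration, and the sketch assumes the temporary directory's path contains no whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with known contents (names are hypothetical).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(hosts host.conf passwd)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close($fh);
}

# List context: every matching name at once.
my @all = glob("$dir/host*");

# List context with nothing to match: an empty list.
my @none = glob("$dir/nosuchfile*");

# Scalar context: one name per call, undef once the matches run out.
my @seen;
while (defined(my $nextname = glob("$dir/host*"))) {
    push @seen, $nextname;
}

print "list context found ", scalar(@all), " names\n";
print "scalar context found ", scalar(@seen), " names\n";
```

Both approaches visit the same names; the scalar form is handy when you'd rather not hold the whole list in memory at once.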

Here the returned filenames begin with /etc/host, so if you want just the last part of the name, you'll have to whittle it down yourself, like so:

while (defined($nextname = </etc/host*>)) {
    $nextname =~ s#.*/##; # remove part before last slash
    print "one of the files is $nextname\n";
}
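
If you'd rather not write the substitution yourself, the standard File::Basename module can do the whittling. The following is only a sketch under the same assumptions as before: a scratch directory stands in for /etc, and the filenames are invented for the example.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Scratch directory standing in for /etc (names are hypothetical).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(hosts host.conf passwd)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close($fh);
}

# glob returns full paths; basename keeps only the part after the last slash.
my @names = map { basename($_) } glob("$dir/host*");
print "one of the files is $_\n" for @names;
```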

Multiple patterns are permitted inside the file glob argument; the lists are constructed separately and then concatenated as if they were one big list:

@fred_barney_files = <fred* barney*>;
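
One way to convince yourself that the lists really are built separately and then joined is to compare a two-pattern glob with two one-pattern globs. This sketch assumes a scratch directory holding a few invented files (fred1, fred2, barney1, wilma1); none of these names come from a real system.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with hypothetical fred, barney, and wilma files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(fred1 fred2 barney1 wilma1)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close($fh);
}

# Two patterns in one glob argument, separated by whitespace...
my @together = glob("$dir/fred* $dir/barney*");

# ...versus two separate globs concatenated by hand.
my @separate = (glob("$dir/fred*"), glob("$dir/barney*"));

print "together: @together\n";
print "separate: @separate\n";
```

Notice that the fred files come before the barney file in both lists: the result follows the order of the patterns rather than being sorted as a whole.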

In other words, the glob returns the same values that an equivalent echo command with the same parameters would return.[3]

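[3] This was no surprise in older versions of Perl, which performed the glob by firing off a C-shell to expand the specified arglist and then parsing what came back. Since Perl 5.6, the glob is handled internally by the File::Glob module, with no external shell involved.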

Although file globbing and regular-expression matching function similarly, the meaning of the various special characters is quite different. Don't confuse the two, or you'll be wondering why <\.c$> doesn't find all of the files that end in .c!
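
To make the difference concrete, here is a sketch that selects the .c files twice: once with a glob pattern and once by running a regular expression over everything the glob *.* would not even be needed for. The scratch directory and its filenames (main.c, util.c, notes.txt) are assumptions made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with hypothetical source and text files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(main.c util.c notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close($fh);
}

# Glob syntax: * means "any run of characters", and the dot is literal.
my @by_glob = glob("$dir/*.c");

# Regular-expression syntax: the dot must be escaped, and $ anchors the end.
my @by_regex = grep { /\.c$/ } glob("$dir/*");

print "glob found:  @by_glob\n";
print "regex found: @by_regex\n";
```

Same files, two quite different notations for saying so.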

The argument to glob is variable-interpolated before expansion, so you can use Perl variables to select files based on a string computed at run-time:

if (-d "/usr/etc") {
    $where = "/usr/etc";
} else {
    $where = "/etc";
}
@files = <$where/*>;

Here we set $where to be one of two different directory names, based on whether or not the directory /usr/etc exists. We then get a list of files in the selected directory. Note that the $where variable is expanded, which means the wildcard to be globbed is either /etc/* or /usr/etc/*.

There's one exception to this rule: the pattern <$var> (meaning to use the variable $var as the entire glob expression) must be written as <${var}> for reasons we'd rather not get into at this point.[4]

[4] The construct <$fred> reads a line from the filehandle named by the contents of the scalar variable $fred. Together with some other features not covered in this book, this construct enables you to use "indirect filehandles" where the name of a handle is passed around and manipulated as if it were data.
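
Here is a small sketch of that exception in action. The variable $pattern, the scratch directory, and the filenames are all invented for the illustration. It shows the braces form next to the glob function, which sidesteps the issue because its argument is an ordinary string.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with hypothetical files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(hosts host.conf passwd)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close($fh);
}

# The whole glob expression lives in one variable.
my $pattern = "$dir/host*";

# The braces tell Perl this is a glob, not a read from a filehandle.
my @via_brackets = <${pattern}>;

# The glob function has no such ambiguity.
my @via_glob = glob($pattern);

print "angle brackets: @via_brackets\n";
print "glob function:  @via_glob\n";
```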

