The shell (or whatever your command-line interpreter is) takes a solitary asterisk (*) command-line argument and turns it into a list of all of the filenames in the current directory (apart from names that begin with a dot). So, when you say rm *, you'll remove all of those files from the current directory. (Don't try this unless you like irritating your system administrator when you request the files to be restored.) Similarly, [a-m]*.c as a command-line argument turns into a list of all filenames in the current directory that begin with a letter in the first half of the alphabet and end in .c, and /etc/host* is a list of all filenames that begin with host in the directory /etc. (If this is new to you, you probably want to read some more about shell scripting somewhere else before proceeding.)
The expansion of arguments like * or /etc/host* into the list of matching filenames is called globbing. Perl supports globbing through a very simple mechanism: just put the globbing pattern between angle brackets or use the more mnemonically named glob function.
    @a = </etc/host*>;
    @a = glob("/etc/host*");
In a list context, as demonstrated here, the glob returns a list of all names that match the pattern (as if the shell had expanded the glob arguments) or an empty list if none match. In a scalar context, the next name that matches is returned, or undef is returned if there are no more matches; this is very similar to reading from a filehandle. For example, to look at one name at a time:
    while (defined($nextname = </etc/host*>)) {
        print "one of the files is $nextname\n";
    }
Here the returned filenames begin with /etc/host, so if you want just the last part of the name, you'll have to whittle it down yourself, like so:
    while ($nextname = </etc/host*>) {
        $nextname =~ s#.*/##; # remove part before last slash
        print "one of the files is $nextname\n";
    }
Multiple patterns are permitted inside the file glob argument; the lists are constructed separately and then concatenated as if they were one big list:
@fred_barney_files = <fred* barney*>;
In other words, the glob returns the same values that an equivalent echo command with the same parameters would return.[3]
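[3] This is actually no surprise when you understand that, in older versions of Perl, performing the glob meant firing off a C-shell to glob the specified arglist and parsing what came back. (Since Perl 5.6, glob is implemented internally through the File::Glob module, with similar results.)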
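The concatenation of several patterns can be checked with a short sketch. It assumes a scratch directory created with File::Temp and three made-up filenames (fred1, fred2, barney1); none of these come from the text above, and the program changes into the scratch directory only so that the patterns stay short.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a scratch directory so the example does not depend on
# whatever files happen to exist on this machine.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "cannot chdir to $dir: $!";

# Hypothetical filenames, created empty for illustration only.
for my $name (qw(fred1 fred2 barney1)) {
    open(my $fh, '>', $name) or die "cannot create $name: $!";
    close($fh);
}

# Two patterns in one glob: each is expanded separately, in order.
my @fred_barney_files = glob("fred* barney*");

# The same list built by hand from two separate globs.
my @combined = (glob("fred*"), glob("barney*"));

print "one glob:  @fred_barney_files\n";
print "two globs: @combined\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir($start) or die "cannot chdir back to $start: $!";
```

Both lists come out in the same order: the names matching the first pattern, followed by the names matching the second.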
Although file globbing and regular-expression matching function similarly, the meaning of the various special characters is quite different. Don't confuse the two, or you'll be wondering why <\.c$> doesn't find all of the files that end in .c!
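To make the difference concrete, here is a minimal sketch that finds the files ending in .c twice: once with a glob pattern and once by filtering every name with a regular expression. The scratch directory and the filenames (main.c, util.c, notes.txt) are assumptions made for the example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Scratch directory with a few hypothetical files.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "cannot chdir to $dir: $!";

for my $name (qw(main.c util.c notes.txt)) {
    open(my $fh, '>', $name) or die "cannot create $name: $!";
    close($fh);
}

# Glob pattern: * matches any run of characters, and the dot is literal.
my @by_glob = sort(glob("*.c"));

# Regular expression: the dot must be escaped, and $ anchors the end.
my @by_regex = sort(grep { /\.c$/ } glob("*"));

print "glob:  @by_glob\n";
print "regex: @by_regex\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir($start) or die "cannot chdir back to $start: $!";
```

The glob pattern *.c and the regular expression \.c$ select the same two names here, but neither pattern would work if handed to the other mechanism.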
The argument to glob is variable-interpolated before expansion. You can use Perl variables to select files based on a string computed at run-time:
    if (-d "/usr/etc") {
        $where = "/usr/etc";
    } else {
        $where = "/etc";
    }
    @files = <$where/*>;
Here we set $where to be one of two different directory names, based on whether or not the directory /usr/etc exists. We then get a list of files in the selected directory. Note that the $where variable is expanded, which means the wildcard to be globbed is either /etc/* or /usr/etc/*.
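The same selection can be tried without touching the real /etc. The sketch below is only an illustration: it builds a stand-in directory named etc (with two made-up files, hosts and passwd) inside a scratch directory, and uses the relative names usr/etc and etc in place of the absolute paths in the text.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Scratch directory standing in for the root of a filesystem.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "cannot chdir to $dir: $!";

# Only "etc" is created, so the test for "usr/etc" below fails.
mkdir("etc") or die "cannot mkdir etc: $!";
for my $name (qw(hosts passwd)) {
    open(my $fh, '>', "etc/$name") or die "cannot create etc/$name: $!";
    close($fh);
}

# Pick the directory at run-time, as in the example above.
my $where = (-d "usr/etc") ? "usr/etc" : "etc";

# $where is interpolated before the pattern is expanded, whether the
# pattern is written in angle brackets or passed to glob().
my @from_brackets = <$where/*>;
my @from_function = glob("$where/*");

print "angle brackets: @from_brackets\n";
print "glob function:  @from_function\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir($start) or die "cannot chdir back to $start: $!";
```

Because only etc exists in the scratch directory, both forms expand the pattern etc/* and return the same two names.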
There's one exception to this rule: the pattern <$var> (meaning to use the variable $var as the entire glob expression) must be written as <${var}> for reasons we'd rather not get into at this point.[4]
[4] The construct <$fred> reads a line from the filehandle named by the contents of the scalar variable $fred. Together with some other features not covered in this book, this construct enables you to use "indirect filehandles" where the name of a handle is passed around and manipulated as if it were data.
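The two meanings of a bare variable in angle brackets can be seen side by side in a short sketch. The scratch directory, the filenames a.c and b.c, and the variable names are assumptions made for the example; calling glob($pattern) directly is shown as an alternative spelling of <${pattern}>.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Scratch directory with two hypothetical one-line files.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "cannot chdir to $dir: $!";

for my $name (qw(a.c b.c)) {
    open(my $out, '>', $name) or die "cannot create $name: $!";
    print $out "line from $name\n";
    close($out);
}

my $pattern = "*.c";

# With the braces, the variable's contents are used as a glob pattern.
my @with_braces = <${pattern}>;

# Calling the function directly avoids the ambiguity altogether.
my @with_function = glob($pattern);

# Without braces, <$fh> is a read from the filehandle held in $fh.
open(my $fh, '<', 'a.c') or die "cannot open a.c: $!";
my $line = <$fh>;
close($fh);

print "braces:   @with_braces\n";
print "function: @with_function\n";
print "readline: $line";

# Leave the scratch directory so it can be cleaned up at exit.
chdir($start) or die "cannot chdir back to $start: $!";
```

Here <${pattern}> and glob($pattern) both return the two filenames, while <$fh> returns the first line of a.c.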