You want to do something to each file and subdirectory in a particular directory.
Use the standard File::Find module.
use File::Find;
sub process_file {
    # do whatever;
}
find(\&process_file, @DIRLIST);
File::Find provides a convenient way to process a directory recursively. It does the directory scans and recursion for you. All you do is pass find a code reference and a list of directories. For each file in those directories, recursively, find calls your function.
Before calling your function, find changes to the directory being visited. The path of that directory, which begins with the starting directory as you passed it to find, is stored in the $File::Find::dir variable. $_ is set to the basename of the file being visited, and the complete path of that file (the directory plus the basename) can be found in $File::Find::name. Your code can set $File::Find::prune to true to tell find not to descend into the directory just seen.
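As a rough sketch of how $File::Find::prune might be used, the following helper collects every path under a set of directories but skips any directory whose basename appears in a caller-supplied hash. The name list_without and its calling convention are hypothetical, invented here for illustration; only find, $_, $File::Find::name, and $File::Find::prune come from the module.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of File::Find): return every path under
# @dirs, without entering directories whose basename is a key of %$skip.
sub list_without {
    my ($skip, @dirs) = @_;          # $skip is a hash ref of directory names
    my @seen;
    find(sub {
        if (-d $_ && $skip->{$_}) {
            $File::Find::prune = 1;  # tell find not to descend into this one
            return;                  # and leave the directory itself unlisted
        }
        push @seen, $File::Find::name;
    }, @dirs);
    return @seen;
}
```

A call such as list_without({ CVS => 1, '.git' => 1 }, '.') would list the current directory tree while leaving out anything beneath CVS or .git directories.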
This simple example demonstrates File::Find. We give find an anonymous subroutine that prints the name of each file visited and adds a / to the names of directories:
@ARGV = qw(.) unless @ARGV;
use File::Find;
find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;
This prints a / after directory names using the -d file test operator, which returns the empty string '' if it fails.
The following program prints the total size of everything in a directory. It gives find an anonymous subroutine that keeps a running sum of the size of each file it visits. That includes all inode types, including directories and symbolic links, not just regular files. Once the find function returns, the accumulated sum is displayed.
use File::Find;
@ARGV = ('.') unless @ARGV;
my $sum = 0;
find sub { $sum += -s }, @ARGV;
print "@ARGV contains $sum bytes\n";
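If only regular files should count toward the total, one possible variation is sketched below. The function name regular_file_bytes is hypothetical, and the choice of lstat (so that a symbolic link is examined itself rather than followed) is an assumption about what such a tool would want, not something the program above does.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: total size in bytes of the regular files under @dirs.
# Directories, symbolic links, and other inode types are ignored.
sub regular_file_bytes {
    my @dirs = @_;
    my $sum = 0;
    find(sub {
        my @st = lstat($_) or return;   # skip entries that cannot be examined
        return unless -f _;             # reuse the lstat result: regular files only
        $sum += $st[7];                 # element 7 of the stat list is the size
    }, @dirs);
    return $sum;
}
```

Called as regular_file_bytes(@ARGV), it returns a number that can be printed in the same way as $sum in the program above.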
This code finds the largest single file within a set of directories:
use File::Find;
@ARGV = ('.') unless @ARGV;
my ($saved_size, $saved_name) = (-1, '');
sub biggest {
    return unless -f && -s _ > $saved_size;
    $saved_size = -s _;
    $saved_name = $File::Find::name;
}
find(\&biggest, @ARGV);
print "Biggest file $saved_name in @ARGV is $saved_size bytes long.\n";
We use $saved_size and $saved_name to keep track of the name and the size of the largest file visited. If we find a file bigger than the largest seen so far, we replace the saved name and size with the current ones. When the find is done running, the largest file and its size are printed out, rather verbosely. A more general tool would probably just print the filename, its size, or both. This time we used a named function rather than an anonymous one because the function was getting big.
It's simple to change this to find the most recently changed file:
use File::Find;
@ARGV = ('.') unless @ARGV;
my ($age, $name);
sub youngest {
    return if defined $age && $age > (stat($_))[9];
    $age = (stat(_))[9];
    $name = $File::Find::name;
}
find(\&youngest, @ARGV);
print "$name " . scalar(localtime($age)) . "\n";
The File::Find module doesn't export its $name variable, so always refer to it by its fully qualified name. The program in Example 9.2 is more a demonstration of namespace munging than of recursive directory traversal, although it does find all the directories. It makes $name in our current package an alias for the one in File::Find, which is essentially how Exporter works. Then it declares its own version of find with a prototype that lets it be called like grep or map.
#!/usr/bin/perl -lw
# fdirs - find all directories
@ARGV = qw(.) unless @ARGV;
use File::Find ();
sub find(&@) { &File::Find::find }
*name = *File::Find::name;
find { print $name if -d } @ARGV;
Our find only calls the find in File::Find, which we were careful not to import by specifying an empty () list in the use statement. Rather than write this:
find sub { print $File::Find::name if -d }, @ARGV;
we can write the more pleasant:
find { print $name if -d } @ARGV;
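The fdirs program runs without use strict, so its unqualified $name needs no declaration. A sketch of how the same aliasing might look under strict follows; the our declaration and the wrapper function directories_under are additions for illustration and do not appear in the example.

```perl
use strict;
use warnings;
use File::Find ();                   # empty list: import nothing

our $name;                           # declare the package variable for strict
*name = *File::Find::name;           # alias the whole typeglob, as fdirs does

sub find(&@) { &File::Find::find }   # block-first prototype, like grep or map

# Hypothetical helper: return the directories found under the given paths.
sub directories_under {
    my @found;
    find { push @found, $name if -d } @_;
    return @found;
}
```

The prototype has to be seen by the compiler before the first block-style call, which is why sub find is defined ahead of directories_under.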
The documentation for the standard File::Find and Exporter modules (also in Chapter 7 of Programming Perl); your system's find(1) manpage; Recipe 9.6