14.4 Using fork

Still another way of creating an additional process is to clone the current Perl process using a UNIX primitive called fork. The fork function simply does what the fork(2) system call does: it creates a clone of the current process. This clone (called the child, with the original called the parent) starts out with the same executable code, the same variable values, and even the same open files. To distinguish the two processes, the return value from fork is zero for the child, and nonzero for the parent (or undef if the system call fails). The nonzero value received by the parent is the child's process ID. You can check the return value and act accordingly:

if (!defined($child_pid = fork())) {
    die "cannot fork: $!";
} elsif ($child_pid) {
    # I'm the parent
} else {
    # I'm the child
}
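
Although the child starts out with the same variables, each process works on its own copy after the fork, so a change made on one side is invisible to the other. Here is a minimal sketch of that behavior; the names $counter and $pid are chosen only for illustration:

```perl
use strict;
use warnings;

my $counter = 10;                  # set before the fork, so both processes start with 10

my $pid = fork();
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # child: change its private copy, then quit
    $counter += 5;
    exit 0;
}

# parent: wait for the child, then look at our own copy
waitpid($pid, 0);
print "parent still sees counter = $counter\n";
```

The parent prints 10, not 15, because the child's addition happened in the child's copy of $counter.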

To best use this clone, we need to learn about a few more things that parallel their UNIX namesakes closely: the wait, exit, and exec functions.

The simplest of these is the exec function. It's just like the system function, except that instead of firing off a new process to execute the shell command, Perl replaces the current process with the shell. (In UNIX parlance, Perl exec's the shell.) After a successful exec, the Perl program is gone, having been replaced by the requested program. For example,

exec "date";

replaces the current Perl program with the date command, so the output of date goes to the standard output of the Perl program. When the date command finishes, there's nothing more to do because the Perl program is long gone.

Another way of looking at this is that the system function is like a fork followed by an exec, as follows:

# METHOD 1... using system:
system("date");

# METHOD 2... using fork/exec:
unless (fork) {
    # fork returned zero, so I'm the child, and I exec:
    exec("date"); # child process becomes the date command
}

Using fork and exec this way isn't quite right though, because the date command and the parent process are both chugging along at the same time, possibly intermingling their output and generally mucking things up. What we need is a way to tell the parent to wait until the child process completes. That's exactly what the wait function does; it waits until the child (any child, to be precise) has completed. The waitpid function is more discriminating: it waits for a specific child process to complete rather than just any kid:

if (!defined($kidpid = fork())) {
    # fork returned undef, so failed
    die "cannot fork: $!";
} elsif ($kidpid == 0) {
    # fork returned 0, so this branch is the child
    exec("date");
    # if the exec fails, fall through to the next statement
    die "can't exec date: $!";
} else {
    # fork returned neither 0 nor undef,
    # so this branch is the parent
    waitpid($kidpid, 0);
}
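
When a parent has several children and doesn't care which one finishes first, the plain wait function is enough. It returns the process ID of whichever child has just completed, or -1 when there are no children left. The following is only a sketch under those assumptions; the %kids hash and the labels are invented for the example:

```perl
use strict;
use warnings;

my %kids;                          # pid => label, for children still running

for my $label (qw(one two three)) {
    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        # child: nothing to do in this sketch, so just quit
        exit 0;
    }
    $kids{$pid} = $label;          # parent: remember this child
}

my $reaped = 0;
while ((my $pid = wait()) != -1) { # wait returns -1 when no children remain
    my $label = delete $kids{$pid};
    print "child $label (pid $pid) finished\n" if defined $label;
    $reaped++;
}
print "reaped $reaped children\n";
```

The order in which the children are reported can vary from run to run, which is exactly why you'd reach for waitpid when you need one particular kid.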

If this all seems rather fuzzy to you, you should probably study up on the fork(2) and exec(2) system calls in a traditional UNIX text, because Perl is pretty much just passing the function calls right down to the UNIX system calls.

The exit function causes an immediate exit from the current Perl process. You'd use this to abort a Perl program from somewhere in the middle, or together with fork to run some Perl code in a child process and then quit. Here's a case of removing some files in /tmp in the background using a forked Perl process:

unless (defined ($pid = fork)) {
    die "cannot fork: $!";
} 
unless ($pid) {
    unlink </tmp/badrock.*>;     # blast those files
    exit;                        # the child stops here
} 
                                 # Parent continues here
waitpid($pid, 0);                # must clean up after dead kid

Without the exit, the child process would continue executing Perl code (at the line marked Parent continues here), and that's definitely not what we want.

The exit function takes an optional parameter, which serves as the numeric exit value that can be noticed by the parent process. The default is to exit with a zero value, indicating that everything went OK.
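
To see how the parent notices that value, recall that waitpid leaves the child's wait status in the special variable $?, with the exit value in the high byte. This small sketch assumes a child that exits with 3; the number and the name $status are arbitrary choices for the example:

```perl
use strict;
use warnings;

my $pid = fork();
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    exit 3;                        # child: report a nonzero (failure) status
}

waitpid($pid, 0);                  # sets $? to the child's wait status
my $status = $? >> 8;              # the exit value lives in the high byte
print "child exited with status $status\n";
```

The parent prints "child exited with status 3". A status of zero would mean the child reported success.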

