mod_perl is an Apache server extension that embeds Perl within Apache, providing a Perl interface to the Apache API. This allows us to develop full-blown Apache modules in Perl to handle particular stages of a client request. It was written by Doug MacEachern, and since it was introduced, its popularity has grown quickly.
The most popular Apache/Perl module is Apache::Registry, which emulates the CGI environment, allowing us to write CGI applications that run under mod_perl. Since Perl is embedded within the server, we avoid the overhead of starting up an external interpreter. In addition, we can load and compile all the external Perl modules we want to use at server startup, and not during the execution of our application. Apache::Registry also caches compiled versions of our CGI applications, thereby providing a further boost. Users have reported performance gains of up to 2000 percent in their CGI applications using a combination of mod_perl and Apache::Registry.
Apache::Registry is a response handler, which means that it is responsible for generating the response that will be sent back to the client. It forms a layer over our CGI applications; it executes our applications and sends the resulting output back to the client. If you don't want to use Apache::Registry, you can implement your own response handler to take care of the request. However, these handlers are quite different from standard CGI scripts, so we won't discuss how to create handlers with mod_perl. To learn about handlers along with anything else you might want to know about mod_perl, refer to Writing Apache Modules with Perl and C by Lincoln Stein and Doug MacEachern (O'Reilly & Associates, Inc.).
Before we go any further, let's install mod_perl. You can obtain it from CPAN at http://www.cpan.org/modules/by-module/Apache/. The Apache namespace is used by modules that are specific to mod_perl. The installation is relatively simple and should proceed well:
$ cd mod_perl-1.22
$ perl Makefile.PL \
>     APACHE_PREFIX=/usr/local/apache \
>     APACHE_SRC=../apache-1.3.12/src \
>     DO_HTTPD=1 \
>     USE_APACI=1 \
>     EVERYTHING=1
$ make
$ make test
$ su
# make install
Refer to the installation directions that came with Apache and mod_perl if you want to perform a custom installation. If you do not plan to develop the various Apache/Perl handlers, you can omit the EVERYTHING=1 option; in that case, you can implement only a PerlHandler.
Once that's complete, we need to configure Apache. Here's a simple setup:
PerlRequire /usr/local/apache/conf/startup.pl
PerlTaintCheck On
PerlWarn On

Alias /perl/ /usr/local/apache/perl/

<Location /perl>
    SetHandler perl-script
    PerlSendHeader On
    PerlHandler Apache::Registry
    Options ExecCGI
</Location>
As you can see, this is very similar to the manner in which we configured FastCGI. We use the PerlRequire directive to execute a startup script. Generally, this is where you would pre-load all the modules that you intend to use (see Example 17-3).
However, if you are interested in loading only a small set of modules (a limit of ten), you can use the PerlModule directive instead:
PerlModule CGI DB_File MLDBM Storable
For Apache::Registry to honor taint mode and warnings, we must add the PerlTaintCheck and PerlWarn directives. Otherwise, they won't be enabled. We do this globally. Then we configure the directory that will hold our scripts.
All requests for resources in the /perl directory go through the perl-script (mod_perl) handler, which then passes the request off to the Apache::Registry module. We also need to enable the ExecCGI option. Otherwise, Apache::Registry will not execute our CGI applications.
Now, here's a sample startup script in Example 17-3.
#!/usr/bin/perl -wT

use Apache::Registry;
use CGI;

## any other modules that you may need for your
## other mod_perl applications running ...

print "Finished loading modules. Apache is ready to go!\n";

1;
It is really a very simple program, which does nothing but load the modules. We also want Apache::Registry to be pre-loaded since it'll be handling all of our requests. A thing to note here is that each of Apache's child processes will have access to these modules.
If we do not load a module at startup, but use it in our applications, then that module will have to be loaded once for each child process. The same applies for our CGI applications running under Apache::Registry. Each child process compiles and caches the CGI application once, so the first request that is handled by that child will be relatively slow, but all subsequent requests will be much faster.
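The compile-once, run-many behavior described above can be pictured with a small, self-contained sketch in plain Perl. It is not Apache::Registry's actual implementation; the run_cached helper, the %compiled hash, and the hello.cgi name are hypothetical, and the script itself stands in for a single Apache child process.

```perl
use strict;
use warnings;

my %compiled;             # per-process cache: script name => code reference
my $compile_count = 0;    # how many times the compilation cost was paid

# Hypothetical helper: compile $source on first use, then reuse the result.
sub run_cached {
    my ( $name, $source ) = @_;

    unless ( $compiled{$name} ) {
        $compile_count++;
        $compiled{$name} = eval "sub { $source }"
            or die "Cannot compile $name: $@";
    }

    return $compiled{$name}->();
}

# Three "requests" for the same script within one process.
my $source = 'return 6 * 7;';
my @output = map { run_cached( 'hello.cgi', $source ) } 1 .. 3;

print "compiled $compile_count time(s), output: @output\n";
```

Only the first call pays for compilation; the later calls reuse the cached code reference, which is the effect each child process sees after its first request for a given script.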
In general, Apache::Registry provides a good emulation of a standard CGI environment. However, there are some differences you need to keep in mind:
The same precautions that apply to FastCGI apply to mod_perl: always use strict mode, and enable warnings as well. You should also always initialize your variables and not assume they are empty when your script starts; the warning flag will tell you when you are using undefined values. Your environment is not cleaned up when your script ends, so global variables and any variables that do not go out of scope remain defined the next time your script is called.
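As a rough illustration of why explicit initialization matters, the following sketch calls the same code three times within one Perl process, the way one child process would serve three requests. The subroutine names and the @errors variable are hypothetical; nothing here is specific to mod_perl.

```perl
use strict;
use warnings;

our @errors;    # package global: survives from one request to the next

# Assumes @errors starts out empty, as it would in a fresh CGI process.
sub request_without_reset {
    my $email = shift;
    push @errors, "missing email" unless length $email;
    return scalar @errors;
}

# Initializes @errors explicitly at the start of every request.
sub request_with_reset {
    my $email = shift;
    @errors = ();
    push @errors, "missing email" unless length $email;
    return scalar @errors;
}

my @requests = ( '', '', 'user@example.com' );

my @stale = map { request_without_reset($_) } @requests;
@errors = ();
my @clean = map { request_with_reset($_) } @requests;

print "without reset: @stale\n";    # error counts pile up across requests
print "with reset:    @clean\n";    # each request starts from zero
```

In the first version, the third request still reports errors left behind by the first two, even though its own input was valid.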
Because your code is compiled only once and then cached, lexical variables in the body of your scripts that you access within your subroutines create closures. For example, it is possible to do this in a standard CGI script:
my $q = new CGI;

check_input( );
.
.
sub check_input {
    unless ( $q->param( "email" ) ) {
        error( $q, "You didn't supply an email address." );
    }
    .
    .
}
Note that we do not pass our CGI object to check_input. However, the variable is still visible to us from within that subroutine. This works fine in CGI, but it creates very subtle, confusing errors in mod_perl. The problem is that the first time the script is run on a particular Apache child process, the value of the CGI object becomes trapped in the cached copy of check_input. All future requests handled by that same Apache child process will reuse the original value of the CGI object within check_input. The solution is to pass $q to check_input as a parameter or else change $q from a lexical variable to a global variable localized with local.
If you are not familiar with closures (they are not commonly used in Perl), refer to the perlsub manpage or Programming Perl.
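The trapped-value problem can be reproduced in plain Perl. The sketch below assumes that the script body ends up inside a subroutine that is compiled once and called for every request; registry_wrapper is a hypothetical stand-in for that subroutine, and a plain string plays the role of the CGI object.

```perl
use strict;
use warnings;

my ( @stale, @fresh );

# Hypothetical stand-in for the cached, compiled script; each call plays
# the role of one request served by the same child process.
sub registry_wrapper {
    my $email = shift;    # plays the role of "my $q = new CGI"

    check_input_implicit();           # relies on $email being visible
    check_input_explicit($email);     # receives the value as a parameter

    # A named subroutine nested inside another subroutine binds to the
    # first instance of $email only. With warnings enabled, Perl reports:
    #   Variable "$email" will not stay shared
    sub check_input_implicit {
        push @stale, $email;
    }

    sub check_input_explicit {
        my $value = shift;
        push @fresh, $value;
    }
}

registry_wrapper($_) for 'first@example.com', 'second@example.com';

print "implicit: @stale\n";    # the first address appears twice
print "explicit: @fresh\n";    # each request sees its own address
```

Passing the value as a parameter sidesteps the problem because the subroutine no longer refers to a variable declared outside its own body.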
The constant module creates constants by defining them internally as subroutines. Since Apache::Registry creates a persistent environment, using constants in this manner can produce the following warnings in the error log when these scripts are recompiled:
Constant subroutine FILENAME redefined at ...
These warnings will not affect the output of your scripts, so you can just ignore them. An alternative is to use global variables instead of constants; the closure issue is not a problem for variables whose values never change. This warning should no longer appear for unmodified code in Perl 5.004_05 and higher.
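A short sketch may make the two options concrete. The FILENAME value and the $Filename variable are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical constant; the constant module defines it as a subroutine.
use constant FILENAME => '/usr/local/apache/data/guestbook.txt';

# Because FILENAME is really a subroutine, defining it again when the
# script is recompiled is what triggers the "redefined" warning.
my $constant_is_sub = defined &FILENAME ? 1 : 0;

# Alternative: a package global that is assigned once and never changed.
our $Filename = '/usr/local/apache/data/guestbook.txt';

print "FILENAME is a subroutine\n" if $constant_is_sub;
print "Both forms hold the same value\n" if FILENAME eq $Filename;
```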
Regular expressions that are compiled with the o flag will remain compiled across all requests for that script, not just for one request.
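The effect is easy to see in a persistent process. In this sketch the two subroutine names are hypothetical, and each call stands for a separate request handled by the same child process.

```perl
use strict;
use warnings;

# With the o flag, $pattern is interpolated and compiled on first use only.
sub matches_once {
    my ( $pattern, $text ) = @_;
    return $text =~ /$pattern/o ? 1 : 0;
}

# Without the o flag, the current value of $pattern is used on every call.
sub matches_fresh {
    my ( $pattern, $text ) = @_;
    return $text =~ /$pattern/ ? 1 : 0;
}

my @once  = ( matches_once( 'cat', 'cat' ),  matches_once( 'dog', 'dog' ) );
my @fresh = ( matches_fresh( 'cat', 'cat' ), matches_fresh( 'dog', 'dog' ) );

print "with o:    @once\n";     # second call still matches against "cat"
print "without o: @fresh\n";
```

If a pattern depends on request data, leaving off the o flag keeps one request's pattern from leaking into the next.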
File age functions, such as -M, calculate their values relative to the time the application began, but with mod_perl, that is typically the time the server started. You can get this value from $^T. Thus, adding the time elapsed since startup (time - $^T, converted from seconds to days for -M, which reports its result in days) to the age of a file will yield the true age.
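The correction can be wrapped in a small helper. This is a minimal sketch: true_age_in_days is a hypothetical name, and because -M reports days while time and $^T are in seconds, the elapsed time is divided by 86,400. To imitate a long-running server, the example sets $^T back by one hour before creating a temporary file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: age of $file in days as of now rather than as of $^T.
sub true_age_in_days {
    my $file = shift;
    my $age  = -M $file;           # days between last modification and $^T
    return unless defined $age;    # file is missing or cannot be examined
    return $age + ( time - $^T ) / 86_400;
}

# Pretend the interpreter started an hour ago, as a server process might have.
$^T = time - 3600;

my ( $fh, $filename ) = tempfile( UNLINK => 1 );
close $fh;

my $raw_age  = -M $filename;                  # negative: newer than "startup"
my $real_age = true_age_in_days($filename);   # close to zero

printf "raw -M: %.4f days, corrected: %.4f days\n", $raw_age, $real_age;
```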
BEGIN blocks are executed once when your script is compiled, not at the beginning of each request. However, END blocks are executed at the end of each request, so you can use these as you normally would.
__END__ and __DATA__ cannot be used within CGI scripts with Apache::Registry. They will cause your scripts to fail.
Typically, your scripts should not call exit in mod_perl, or it will cause Apache to exit instead (remember, the Perl interpreter is embedded within the web server). However, Apache::Registry overrides the standard exit command so it is safe for these scripts.
If it's too much of a hassle to convert your application to run effectively under Apache::Registry, then you should investigate the Apache::PerlRun module. This module uses the Perl interpreter embedded within Apache, but doesn't cache compiled versions of your code. As a result, it can run sloppy CGI scripts, but without the full performance improvement of Apache::Registry. It will, nonetheless, be faster than a typical CGI application.
Increasing the speed of CGI scripts is only part of what mod_perl can do. It also allows you to write code in Perl that interacts with the Apache response cycle, so you can do things like handle authentication and authorization yourself. A full discussion of mod_perl is certainly beyond the scope of this book. If you want to learn more about mod_perl, then you should definitely start with Stas Bekman's mod_perl guide, available at http://perl.apache.org/guide/. Then look at Writing Apache Modules with Perl and C, which provides a very thorough, although technical, overview of mod_perl.
Copyright © 2001 O'Reilly & Associates. All rights reserved.