Contents:
Common Errors
Programming/System Errors
Environment Variables
Logging and Simulation
CGI Lint: A Debugging/Testing Tool
Set UID/GID Wrapper
The hardest part of developing CGI applications for the Web is testing and debugging them. The difficulty arises because the applications run across a network and involve interaction between a client and a server, so when a CGI program fails, it is hard to tell where the error lies.
In this chapter, we will discuss some of the common errors in CGI script design, and what you can do to correct them. In addition, we will look at a debugging/lint tool for CGI applications, called CGI Lint, written exclusively for this book.
Initially, we will discuss some of the simpler errors found in CGI application design. Most CGI designers encounter these errors at one time or another. However, they are extremely easy to fix.
Most servers require that CGI scripts reside in a special directory (/cgi-bin), or have certain file extensions. If you try to execute a script that does not follow the rules for a particular server, the server will simply retrieve and display the document, instead of executing it. For example, if you have the following two lines in your NCSA server resource map configuration file (srm.conf):
ScriptAlias /my-cgi-apps/ /usr/local/bin/httpd_1.4.2/cgi-bin/
AddType application/x-httpd-cgi .cgi .pl
the server will execute only scripts with URLs that either contain the string "/my-cgi-apps," or have a file extension of .pl or .cgi. Take a look at the following URLs and figure out which ones the server will try to execute:
http://some.machine.com/cgi-bin/clock.tcl
http://my.machine.edu/my-cgi-apps/clock.pl
http://your.machine.org/index.cgi
http://their.machine.net/cgi-bin/animation.pl
If you picked the last three, then you are correct! Let's look at why this is so. The first one will not get executed because the script is not in a recognized directory (my-cgi-apps), and it does not have a valid extension (.cgi or .pl). The second one refers to the correct CGI directory, while the last two have valid extensions.
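The reasoning above can be sketched as a small shell function. This is only an illustration of the two rules in the example configuration, not how the NCSA server is implemented: would_execute is a hypothetical helper, it treats /my-cgi-apps/ as a prefix of the URL path, and it ignores query strings and extra path information.

```shell
#!/bin/sh
# would_execute: hypothetical helper that mimics the two srm.conf rules
# from the example (ScriptAlias /my-cgi-apps/ and AddType .cgi .pl).
would_execute() {
    case "$1" in
        /my-cgi-apps/*) echo "execute (ScriptAlias directory)" ;;
        *.cgi|*.pl)     echo "execute (AddType extension)" ;;
        *)              echo "serve as document" ;;
    esac
}

# The path portions of the four URLs from the text.
for path in /cgi-bin/clock.tcl /my-cgi-apps/clock.pl /index.cgi /cgi-bin/animation.pl; do
    printf '%s -> %s\n' "$path" "$(would_execute "$path")"
done
```

Only the first path falls through to "serve as document", which matches the answer given above.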
If your CGI application is a script of some sort (C shell, Perl, and so on), its first line must begin with #! (a "sharp-bang," or "shebang") followed by the path to the interpreter, or else the server will not know what interpreter to call to execute the script. You don't have to worry about this if your CGI program is written in C/C++, or any other language that creates a binary. This leads us to another closely related problem, as we will soon see.
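A quick way to catch a missing interpreter line is to look at the first line of the script. The following is a minimal sketch: has_shebang is a hypothetical helper, and the interpreter path /usr/local/bin/perl in the demonstration file is an assumption that will differ from system to system.

```shell
#!/bin/sh
# has_shebang: hypothetical helper. Succeeds when the first line of the
# file starts with "#!", the marker described in the text.
has_shebang() {
    first_line=$(head -n 1 "$1") || return 1
    case "$first_line" in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Demonstration on two throwaway files (the Perl path is an assumption).
demo_dir=$(mktemp -d)
printf '#!/usr/local/bin/perl\nprint "Content-type: text/plain\\n\\n";\n' > "$demo_dir/good.pl"
printf 'print "Content-type: text/plain\\n\\n";\n' > "$demo_dir/bad.pl"

for script in "$demo_dir/good.pl" "$demo_dir/bad.pl"; do
    if has_shebang "$script"; then
        echo "${script##*/}: interpreter line found"
    else
        echo "${script##*/}: no interpreter line"
    fi
done
rm -rf "$demo_dir"
```

The check only confirms that the line is present. It does not verify that the interpreter named on it actually exists.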
The CGI script must be executable by the server. Most servers are set up to run with the user identification (UID) of "nobody," which means that your scripts have to be world executable. The reason for this is that "nobody" has minimal privileges. You can check the permissions of your script on UNIX systems by using the ls command:
% ls -ls /usr/local/bin/httpd_1.4.2/cgi-bin/clock.pl
4 -rwx------ 1 shishir 3624 Aug 17 17:59 clock.pl*
The second field lists the permissions for the file. This field is divided into three parts: the privileges for the owner, the group, and the world (from left to right), with the first letter indicating the type of the file: either a regular file, or a directory. In this example, the owner has sole permission to read, write, and execute the script.
If you want the server (running as "nobody") to be able to execute this script, you have to issue the following command:
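% chmod 755 clock.pl
% ls -ls clock.pl
4 -rwxr-xr-x 1 shishir 3624 Aug 17 17:59 clock.pl*
The chmod command modifies the permissions for the file. The octal code of 755 grants read (octal 4), write (octal 2), and execute (octal 1) permissions to the owner (4 + 2 + 1 = 7), and read and execute permissions (4 + 1 = 5) to group members and all other users. The read permission matters for scripts because the interpreter has to read the file in order to run it; for a compiled binary, execute permission alone (a mode of 711) is enough.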
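To see how the octal codes map onto the permission field, it can help to try a few modes on a scratch file. The sketch below assumes a UNIX system with mktemp available; perms and world_executable are hypothetical helpers that simply read the output of ls. Mode 711 gives other users execute permission only, while 755 also lets them read the file, which an interpreter needs in order to run a script.

```shell
#!/bin/sh
# perms: hypothetical helper. Prints the first ten characters of the
# long listing: the file type followed by the nine permission bits.
perms() {
    ls -ld "$1" | cut -c1-10
}

# world_executable: hypothetical helper. The tenth character is the
# execute bit for all other users (such as "nobody").
world_executable() {
    case "$(perms "$1")" in
        ?????????x) return 0 ;;
        *)          return 1 ;;
    esac
}

script=$(mktemp)    # scratch file standing in for clock.pl
for mode in 700 711 755; do
    chmod "$mode" "$script"
    if world_executable "$script"; then
        verdict="world executable"
    else
        verdict="not world executable"
    fi
    echo "$mode $(perms "$script") $verdict"
done
rm -f "$script"
```

The loop prints -rwx------ for 700, -rwx--x--x for 711, and -rwxr-xr-x for 755, and only the last two are reported as world executable.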
All CGI applications must output a valid HTTP header, followed by a blank line, before any other data. In other words, two newline characters have to be output after the header. Here is how the output should look:
Content-type: text/html

<HTML>
<HEAD><TITLE>Output from CGI Script</TITLE></HEAD>
.
.
.
The headers must be output before any other data, or the server will generate a server error with a status of 500. So make it a habit to output this data as early in the script as possible. To make it easier for yourself, you can use a subroutine like the following to output the correct information:
sub output_MIME_header {
    # Print the Content-type header followed by the required blank line.
    my ($type) = @_;
    print "Content-type: ", $type, "\n\n";
}
Just remember to call it at the beginning of your program (before you output anything else). Another problem related to this topic has to do with how the script executes. If the CGI program has errors, then the interpreter, or compiler, will produce an error message when trying to execute the program. These error messages will inevitably be output before the HTTP header, and the server will complain.
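One way to spot this problem before the server does is to examine what the script prints. The following sketch reads a script's output on standard input and checks that it starts with at least one header line followed by a blank line. has_valid_header is a hypothetical helper, and its idea of a header (a name made of letters and hyphens, then a colon) is a rough heuristic rather than a full parser, so an error message that happens to look like "Word: text" would slip through.

```shell
#!/bin/sh
# has_valid_header: hypothetical helper. Reads CGI output on stdin and
# succeeds only if one or more "Name: value" lines come first and are
# followed by a blank line.
has_valid_header() {
    awk '
        /^$/             { ok = (NR > 1); done = 1; exit }   # blank line ends the header
        !/^[A-Za-z-]+:/  { ok = 0; done = 1; exit }          # stray text before the blank line
        END              { exit ((done && ok) ? 0 : 1) }
    '
}

# Well-formed output: header, blank line, then the document.
printf 'Content-type: text/html\n\n<HTML>\n' | has_valid_header && echo "header ok"

# An interpreter message printed ahead of the header.
printf 'syntax error in file clock.pl\nContent-type: text/html\n\n' | has_valid_header || echo "header malformed"
```

In practice you would pipe the real script into the function, for example ./clock.pl 2>&1 | has_valid_header, so that any error messages are included in the check.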
What is the moral of this? Make sure you check your script from the command line before you try to execute it on the Web. If you are using Perl, you can use the -wc switch to check for syntax errors:
% perl -wc clock.pl
syntax error in file clock.pl at line 9, at EOF
clock.pl had compilation errors.
If there are no errors (but there are warnings), the Perl interpreter will display the following:
% perl -wc clock.pl
Possible typo: "opt_g" at clock.pl line 9.
Possible typo: "opt_u" at clock.pl line 9.
Possible typo: "opt_f" at clock.pl line 9.
clock.pl syntax OK
Warnings indicate such things as possible typing errors or use of uninitialized variables. Most of the time, these warnings are benign, but you should still take the time to look into them. Finally, if there are no warnings or errors to be displayed, Perl will output the following:
% perl -wc clock.pl
clock.pl syntax OK
So before you try a script out on the Web, make sure it runs without any errors from the command line.
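If you have several Perl scripts to check, the same test can be run in a loop. This is a minimal sketch that assumes perl is on your path and relies on perl returning a nonzero exit status when compilation fails; check_scripts is a hypothetical helper. It discards perl's messages, so rerun perl -wc by hand on any script to read the actual errors and warnings.

```shell
#!/bin/sh
# check_scripts: hypothetical helper. Runs "perl -wc" on each file named
# on the command line and reports which ones fail to compile.
# Returns nonzero if any script failed.
check_scripts() {
    failed=0
    for script in "$@"; do
        if perl -wc "$script" >/dev/null 2>&1; then
            echo "ok: $script"
        else
            echo "FAILED: $script"
            failed=1
        fi
    done
    return "$failed"
}

# Example invocation (the directory is the one used earlier in the text):
#   check_scripts /usr/local/bin/httpd_1.4.2/cgi-bin/*.pl
```

Because the function returns a nonzero status when any script fails, it can be used as a gate in a larger installation script.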