CGI scripts are given predefined environment variables that provide information about the web server as well as the client. Much of this information is drawn from the headers of the HTTP request. In Perl, environment variables are available to your script via the global hash %ENV.
You are free to add, delete, or change any of the values of %ENV. Subprocesses created by your script will also inherit these environment variables, along with any changes you've made to them.
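As a small illustration of this inheritance, the following sketch sets a value in %ENV and then asks a child Perl process to report what it sees. The variable name MY_APP_MODE and the helper child_sees are made up for this example, and the sketch assumes a platform that supports the list form of a piped open. It omits the -T switch used elsewhere in this chapter, since taint mode would require cleaning up $ENV{PATH} before starting a child process.

```perl
use strict;
use warnings;

# Return the value that a child process sees for the named environment
# variable. The child is another copy of the running Perl interpreter ($^X).
sub child_sees {
    my ($var_name) = @_;
    my $child_code = 'print $ENV{ $ARGV[0] } if defined $ENV{ $ARGV[0] }';

    open( my $child, '-|', $^X, '-e', $child_code, $var_name )
        or die "Cannot start child process: $!";
    local $/;                  # read all of the child's output at once
    my $value = <$child>;
    close $child;
    return $value;
}

# MY_APP_MODE is a hypothetical variable, set here by the parent script
$ENV{MY_APP_MODE} = 'debug';
print "Child sees: ", child_sees('MY_APP_MODE'), "\n";
```

The change is visible to the child because it is made before the child is started; it does not affect the web server or any process that is already running.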
The standard CGI environment variables listed in Table 3-1 should be available on any server supporting CGI. Nonetheless, if you loop through all the keys in %ENV, you will probably not see all the variables listed here. If you recall, some HTTP request headers are used only with certain requests. For example, the Content-length header is sent only with POST requests. The environment variables that map to these HTTP request headers will thus be missing when their corresponding header fields are missing. In other words, $ENV{CONTENT_LENGTH} will only exist for POST requests.
| Environment Variable | Description |
| --- | --- |
| AUTH_TYPE | The authentication method used to validate a user. This is blank if the request did not require authentication. |
| CONTENT_LENGTH | The length of the data (in bytes) passed to the CGI program via standard input. |
| CONTENT_TYPE | The media type of the request body, such as "application/x-www-form-urlencoded". |
| DOCUMENT_ROOT | The directory from which static documents are served. |
| GATEWAY_INTERFACE | The revision of the Common Gateway Interface that the server uses. |
| PATH_INFO | Extra path information passed to a CGI program. |
| PATH_TRANSLATED | The translated version of the path given by the variable PATH_INFO. |
| QUERY_STRING | The query information from the requested URL (i.e., the data following "?"). |
| REMOTE_ADDR | The remote IP address of the client making the request; this could be the address of an HTTP proxy between the server and the user. |
| REMOTE_HOST | The remote hostname of the client making the request; this could also be the name of an HTTP proxy between the server and the user. |
| REMOTE_IDENT | The user making the request, as reported by their ident daemon. Only some Unix and IRC users are likely to have this running. |
| REMOTE_USER | The user's login, authenticated by the web server. |
| REQUEST_METHOD | The HTTP request method used for this request. |
| SCRIPT_NAME | The URL path (e.g., /cgi/program.cgi) of the script being executed. |
| SERVER_NAME | The server's hostname or IP address. |
| SERVER_PORT | The port number of the host on which the server is listening. |
| SERVER_PROTOCOL | The name and revision of the request protocol, e.g., "HTTP/1.1". |
| SERVER_SOFTWARE | The name and version of the server software that is answering the client request. |
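Because a variable such as CONTENT_LENGTH may be absent from %ENV altogether, it is safer to test for it before using it. The following is a minimal sketch of one way to do that; the helper name body_length and the choice to treat a missing or malformed value as zero are assumptions made for this example, not part of the CGI standard.

```perl
use strict;
use warnings;

# Return the length of the request body in bytes, or 0 when the server did
# not supply CONTENT_LENGTH (for example, on a typical GET request).
sub body_length {
    my ($env) = @_;            # hash reference, normally \%ENV

    return 0 unless exists $env->{CONTENT_LENGTH}
                 && defined $env->{CONTENT_LENGTH};

    # Accept only a plain run of digits; treat anything else as "no body"
    return $env->{CONTENT_LENGTH} =~ /^(\d+)$/ ? $1 + 0 : 0;
}

print "Body length: ", body_length( \%ENV ), " bytes\n";
```

Passing the hash as a reference, rather than reading %ENV directly inside the subroutine, makes it easy to try the helper with sample data outside of a web server.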
Any HTTP headers that the web server does not recognize as standard headers, as well as a few other common headers, are also available to your script. The web server follows these rules for creating the name of the environment variable:
The field name is capitalized.
All dashes are converted to underscores.
The prefix HTTP_ is added to the name.
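The three rules above can be sketched as a short Perl subroutine. The name header_to_env_name is invented for this illustration, and X-Custom-Header is a made-up header; the web server performs this conversion itself, so you would not normally need such a routine in a CGI script.

```perl
use strict;
use warnings;

# Convert an HTTP header field name to the name of the environment
# variable a web server would typically create for it.
sub header_to_env_name {
    my ($field_name) = @_;

    my $name = uc $field_name;     # 1. capitalize the field name
    $name =~ tr/-/_/;              # 2. convert dashes to underscores
    return "HTTP_$name";           # 3. add the HTTP_ prefix
}

foreach my $field ( 'User-Agent', 'Accept-Language', 'X-Custom-Header' ) {
    printf "%-16s => %s\n", $field, header_to_env_name($field);
}
```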
Table 3-2 provides a list of some of the more common of these environment variables.
| Environment Variable | Description |
| --- | --- |
| HTTP_ACCEPT | A list of the media types the client can accept. |
| HTTP_ACCEPT_CHARSET | A list of the character sets the client can accept. |
| HTTP_ACCEPT_ENCODING | A list of the encodings the client can accept. |
| HTTP_ACCEPT_LANGUAGE | A list of the languages the client can accept. |
| HTTP_COOKIE | A name-value pair previously set by the server. |
| HTTP_FROM | The email address of the user making the request; most browsers do not pass this information, since it is considered an invasion of the user's privacy. |
| HTTP_HOST | The hostname of the server from the requested URL (this corresponds to the HTTP 1.1 Host field). |
| HTTP_REFERER | The URL of the document that directed the user to this CGI program (e.g., via a hyperlink or via a form). |
| HTTP_USER_AGENT | The name and version of the client's browser. |
A secure server typically adds many more environment variables for secure connections. Much of this information is based on X.509 and provides information about the server's and possibly the browser's certificates. Because you really won't need to understand these details in order to write CGI scripts, we won't get into X.509 or secure HTTP transactions in this book. For more information, refer to RFC 2511 or the public key infrastructure working group's web site at http://www.imc.org/ietf-pkix/.
The names of the environment variables supplied to your script for secure connections vary by server. The HTTPS environment variable (see Table 3-3) is commonly supported, however, and useful to test whether your connection is secure; unfortunately its values vary between servers. Refer to your server's documentation for more information or use Example 3-1 or Example 3-2 to generate data for your server.
| Environment Variable | Description |
| --- | --- |
| HTTPS | This variable can be used as a flag to indicate whether the connection is secure; its values vary by server (e.g., "ON" or "on" when secure and blank or "OFF" when not). |
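Since the value of HTTPS differs between servers, a test for a secure connection has to tolerate several spellings. The following sketch assumes that any non-empty value other than "off" (in any mix of upper and lower case) indicates a secure connection. That rule is a guess based on the values mentioned above, and is_secure is a name made up for this example, so check it against your own server before relying on it.

```perl
use strict;
use warnings;

# Guess whether the request arrived over a secure connection.
# Assumption: a non-empty HTTPS value other than "off" means "secure".
sub is_secure {
    my ($env) = @_;                # hash reference, normally \%ENV
    my $flag = $env->{HTTPS};

    return 0 unless defined $flag && length $flag;
    return lc($flag) eq 'off' ? 0 : 1;
}

print is_secure( \%ENV ) ? "Secure connection\n" : "Not a secure connection\n";
```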
Finally, the web server may provide additional environment variables beyond those mentioned in this section. Most web servers also allow the administrator to add environment variables via a configuration file. You might take advantage of this feature if you have several CGI scripts that all share common configuration information, such as the name of the database server to connect to. Having the variable defined once in the web server's configuration file makes it easy to change later.
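A script that relies on such an administrator-defined variable should still behave sensibly when the variable is missing. The sketch below shows one possible approach. The variable name DB_SERVER, the default of localhost, and the helper server_setting are all hypothetical; how the variable gets defined depends on your server (on Apache, for instance, this is typically done with the SetEnv directive).

```perl
use strict;
use warnings;

# Look up a setting that the administrator may have placed in the
# environment, falling back to a default when it is absent or empty.
sub server_setting {
    my ( $env, $name, $default ) = @_;
    my $value = $env->{$name};
    return ( defined $value && length $value ) ? $value : $default;
}

# DB_SERVER is a hypothetical variable name used only for illustration
my $db_server = server_setting( \%ENV, 'DB_SERVER', 'localhost' );
print "Database server: $db_server\n";
```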
Because browsers and web servers may provide additional environment variables to your script, it's often helpful to have a list of environment variables that is specific to your web server. Example 3-1 shows a short script that is easy to remember and type in when you find yourself working on a new system. It generates a handy list of environment variables specific to that web server. Remember that the browser may also affect this list. For example, HTTP_COOKIE will only appear if the browser supports cookies, if cookies have not been disabled, and if the browser has received a previous request from this web server to set a cookie.
    #!/usr/bin/perl -wT
    # Print a formatted list of all the environment variables

    use strict;

    print "Content-type: text/html\n\n";

    my $var_name;
    foreach $var_name ( sort keys %ENV ) {
        print "<P><B>$var_name</B><BR>";
        print $ENV{$var_name};
    }
This simply produces an alphabetic list of the environment variable names and their values, shown in Figure 3-2.
Because this is simply a quick-and-dirty script, we omitted some details that should be included in production CGI scripts, and which are included in the other examples. For example, we did not print a valid HTML document (it is missing the enclosing HTML, HEAD, and BODY tags). These should certainly be added if the script were to grow beyond a few lines or if you intended for people other than yourself to use it.
Example 3-2 shows a more elaborate version that displays all of the environment variables that CGI and your web server define, along with a brief explanation of the standard variables.
    #!/usr/bin/perl -wT

    use strict;

    my %env_info = (
        SERVER_SOFTWARE   => "the server software",
        SERVER_NAME       => "the server hostname or IP address",
        GATEWAY_INTERFACE => "the CGI specification revision",
        SERVER_PROTOCOL   => "the server protocol name",
        SERVER_PORT       => "the port number for the server",
        REQUEST_METHOD    => "the HTTP request method",
        PATH_INFO         => "the extra path info",
        PATH_TRANSLATED   => "the extra path info translated",
        DOCUMENT_ROOT     => "the server document root directory",
        SCRIPT_NAME       => "the script name",
        QUERY_STRING      => "the query string",
        REMOTE_HOST       => "the hostname of the client",
        REMOTE_ADDR       => "the IP address of the client",
        AUTH_TYPE         => "the authentication method",
        REMOTE_USER       => "the authenticated username",
        REMOTE_IDENT      => "the remote user is (RFC 931): ",
        CONTENT_TYPE      => "the media type of the data",
        CONTENT_LENGTH    => "the length of the request body",
        HTTP_ACCEPT       => "the media types the client accepts",
        HTTP_USER_AGENT   => "the browser the client is using",
        HTTP_REFERER      => "the URL of the referring page",
        HTTP_COOKIE       => "the cookie(s) the client sent"
    );

    print "Content-type: text/html\n\n";

    print <<END_OF_HEADING;
    <HTML>
    <HEAD>
        <TITLE>A List of Environment Variables</TITLE>
    </HEAD>

    <BODY>
    <H1>CGI Environment Variables</H1>

    <TABLE BORDER=1>
      <TR>
        <TH>Variable Name</TH>
        <TH>Description</TH>
        <TH>Value</TH>
      </TR>
    END_OF_HEADING

    my $name;

    # Add additional variables defined by web server or browser
    foreach $name ( keys %ENV ) {
        $env_info{$name} = "an extra variable provided by this server"
            unless exists $env_info{$name};
    }

    foreach $name ( sort keys %env_info ) {
        my $info  = $env_info{$name};
        my $value = $ENV{$name} || "<I>Not Defined</I>";
        print "<TR><TD><B>$name</B></TD><TD>$info</TD><TD>$value</TD></TR>\n";
    }

    print "</TABLE>\n";
    print "</BODY></HTML>\n";
The %env_info hash contains the standard environment variable names and their descriptions. The first foreach loop iterates over the keys of %ENV to add any additional environment variables defined by the current web server. Then the second foreach loop iterates through the combined list and displays the name, description, and value of each environment variable. Figure 3-3 shows what the output will look like in a browser window.
This covers most of CGI input, but we have not discussed how to read the message body for POST requests. We will return to that topic when we discuss forms in the next chapter. Right now, let's look at CGI output.
Copyright © 2001 O'Reilly & Associates. All rights reserved.