Chapter 5. Authentication

The volume of business Butterthlies, Inc. is doing is stupendous, and naturally our competitors are anxious to look at sensitive information such as the discounts we give our salespeople. We have to seal our site off from their vulgar gaze by authenticating those who log on to it.

5.1 Authentication Protocol

Authentication is simple in principle. The client sends his name and password to Apache. Apache looks up its file of names and encrypted passwords to see whether the client is entitled to access. The webmaster can store a number of clients in a list — either as a simple text file or as a database — and thereby control access person by person.

It is also possible to group a number of people into named groups and to give or deny access to these groups as a whole. So, throughout this chapter, bill and ben are in the group directors, and daphne and sonia are in the group cleaners. The webmaster can require user so and so or require group such and such, or even simply require that visitors be registered users. If you have to deal with large numbers of people, it is obviously easier to group them in this way. To make the demonstration simpler, the password is always theft. Naturally, you would not use so short and obvious a password in real life, or one so open to a dictionary attack.

Each username/password pair is valid for a particular realm, which is named when the passwords are created. The browser asks for a URL; the server sends back "Authentication Required" (code 401) and the realm. If the browser already has a username/password for that realm, it sends the request again with the username/password. If not, it prompts the user, usually including the realm's name in the prompt, and sends that.

Of course, all this is worryingly insecure since the password is sent unencrypted over the Web (base64 encoding is easily reversed), and any malign observer simply has to watch the traffic to get the password — which is as good in his hands as in the legitimate client's. Digest authentication improves on this by using a challenge/handshake protocol to avoid revealing the actual password. In the two earlier editions of this book, we had to report that no browsers actually supported this technique; now things are a bit better. Using SSL (see Chapter 11) also improves this.
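
To see how little protection Base64 gives, here is a minimal Python sketch of how a Basic credential is packed into an Authorization header and unpacked again by anyone who observes it. The helper names are our own and are not part of Apache or any browser.

```python
import base64


def basic_auth_header(username, password):
    """Build the value a browser sends in the Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic_auth(header_value):
    """Recover (username, password) from an observed header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("bill", "theft")
print(header)
print(decode_basic_auth(header))
```

No key or secret is needed for the second step, which is why an eavesdropper is as well off as the legitimate client.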

5.1.1 site.authent

Examples are found in site.authent. The first Config file, .../conf/httpd1.conf, looks like this:

User webuser
Group webgroup
ServerName www.butterthlies.com
NameVirtualHost 192.168.123.2

<VirtualHost www.butterthlies.com>
ServerAdmin sales@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.authent/htdocs/customers
ServerName www.butterthlies.com
ErrorLog /usr/www/APACHE3/site.authent/logs/error_log
TransferLog /usr/www/APACHE3/site.authent/logs/customers/access_log
ScriptAlias /cgi-bin /usr/www/APACHE3/cgi-bin
</VirtualHost>

<VirtualHost sales.butterthlies.com>
ServerAdmin sales_mgr@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.authent/htdocs/salesmen
ServerName sales.butterthlies.com
ErrorLog /usr/www/APACHE3/site.authent/logs/error_log
TransferLog /usr/www/APACHE3/site.authent/logs/salesmen/access_log
ScriptAlias /cgi-bin /usr/www/APACHE3/cgi-bin

<Directory /usr/www/APACHE3/site.authent/htdocs/salesmen>
AuthType Basic
AuthName darkness
AuthUserFile /usr/www/APACHE3/ok_users/sales
AuthGroupFile /usr/www/APACHE3/ok_users/groups
require valid-user
</Directory>

</VirtualHost>

What's going on here? The key directive is AuthType Basic in the <Directory ...salesmen> block. This turns Authentication checking on.

5.2 Authentication Directives

From Apache v1.3 on, filenames are relative to the server root unless they are absolute. A filename is taken as absolute if it starts with / or, on Win32, if it starts with drive:/. It seems sensible for us to write them in absolute form to prevent misunderstandings. The directives are as follows:

AuthType  

AuthType type
directory, .htaccess
 

AuthType specifies the type of authorization control. Basic was originally the only possible type, but Apache 1.1 introduced Digest, which uses an MD5 digest and a shared secret.

If the directive AuthType is used, we must also use AuthName, AuthGroupFile, and AuthUserFile.

AuthName  

AuthName auth-realm
directory, .htaccess
 

AuthName gives the name of the realm in which the users' names and passwords are valid. If the name of the realm includes spaces, you will need to surround it with quotation marks:

AuthName "sales people"
AuthGroupFile  

AuthGroupFile filename
directory, .htaccess
 

AuthGroupFile has nothing to do with the Group webgroup directive at the top of the Config file. It gives the name of another file that contains group names and their members:

cleaners: daphne sonia
directors: bill ben

We put this into ... /ok_users/groups and set AuthGroupFile to match. The AuthGroupFile directive has no effect unless the require directive is suitably set.
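
The group file format is simple enough to read by hand. As an illustration only, the following Python sketch parses text in that format into a lookup table; the function name is hypothetical, and skipping blank lines and # comments is our own assumption rather than documented Apache behavior.

```python
def parse_group_file(text):
    """Map each group name to the set of usernames listed after the colon."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: ignore blank lines and comments
        name, _, members = line.partition(":")
        groups[name.strip()] = set(members.split())
    return groups


sample = "cleaners: daphne sonia\ndirectors: bill ben\n"
groups = parse_group_file(sample)
print(sorted(groups["cleaners"]))
```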

AuthUserFile  

AuthUserFile filename
 

AuthUserFile names a file of usernames and their encrypted passwords. There is quite a lot to this; see Section 5.3, Section 5.4, and Section 5.5 later in this chapter.

AuthAuthoritative  

AuthAuthoritative on|off
Default: AuthAuthoritative on
directory, .htaccess
 

Setting the AuthAuthoritative directive explicitly to off allows for both authentication and authorization to be passed on to lower-level modules (as defined in the Config and modules.c files) if there is no user ID or rule matching the supplied user ID. If there is a user ID and/or rule specified, the usual password and access checks will be applied, and a failure will give an Authorization Required reply.

So if a user ID appears in the database of more than one module or if a valid Require directive applies to more than one module, then the first module will verify the credentials, and no access is passed on — regardless of the AuthAuthoritative setting.

A common use for this is in conjunction with one of the database modules, such as mod_auth_db.c, mod_auth_dbm.c, mod_auth_msql.c, and mod_auth_anon.c. These modules supply the bulk of the user-credential checking, but a few (administrator) related accesses fall through to a lower level with a well-protected AuthUserFile.
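
The fall-through rule described above can be modelled in a few lines of Python. This is a simplified sketch, not Apache's code: each module is represented by a dictionary of users plus its Authoritative setting, and passwords are held in clear only to keep the example short. All names and the admin account are invented for illustration.

```python
def check_password(user, password, modules):
    """modules: list of (users, authoritative) pairs in the order consulted."""
    for users, authoritative in modules:
        if user in users:
            # A module that knows the user decides the matter, pass or fail.
            return users[user] == password
        if authoritative:
            # Unknown user and no fall-through: Authorization Required.
            return False
    return False


database = {"bill": "theft", "ben": "theft"}   # bulk of the users
htpasswd_file = {"admin": "s3cret"}            # small, well-protected file

# With the first module's Authoritative setting off, admin falls through.
print(check_password("admin", "s3cret", [(database, False), (htpasswd_file, True)]))
# With it on (the default), the lookup stops at the first module.
print(check_password("admin", "s3cret", [(database, True), (htpasswd_file, True)]))
```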

Default

By default, control is not passed on, and an unknown user ID or rule will result in an Authorization Required reply. Not setting it thus keeps the system secure.

Security

Do consider the implications of allowing a user to allow fall-through in her .htaccess file, and verify that this is really what you want. Generally, it is easier just to secure a single .htpasswd file than it is to secure a database such as mSQL. Make sure that the AuthUserFile is stored outside the document tree of the web server; do not put it in the directory that it protects. Otherwise, clients will be able to download the AuthUserFile.

AuthDBAuthoritative  

AuthDBAuthoritative on|off
Default: AuthDBAuthoritative on
directory, .htaccess
 

Setting the AuthDBAuthoritative directive explicitly to off allows for both authentication and authorization to be passed on to lower-level modules (as defined in the Config and modules.c files) if there is no user ID or rule matching the supplied user ID. If there is a user ID and/or rule specified, the usual password and access checks will be applied, and a failure will give an Authorization Required reply.

So if a user ID appears in the database of more than one module or if a valid Require directive applies to more than one module, then the first module will verify the credentials, and no access is passed on — regardless of the AuthAuthoritative setting.

A common use for this is in conjunction with one of the basic auth modules, such as mod_auth.c. Whereas this DB module supplies the bulk of the user-credential checking, a few (administrator) related accesses fall through to a lower level with a well-protected .htpasswd file.

Default

By default, control is not passed on, and an unknown user ID or rule will result in an Authorization Required reply. Not setting it thus keeps the system secure.

Security

Do consider the implications of allowing a user to allow fall-through in his .htaccess file, and verify that this is really what you want. Generally, it is easier just to secure a single .htpasswd file than it is to secure a database that might have more access interfaces.

AuthDBMAuthoritative  

AuthDBMAuthoritative on|off
Default: AuthDBMAuthoritative on
directory, .htaccess
 

Setting the AuthDBMAuthoritative directive explicitly to off allows for both authentication and authorization to be passed on to lower-level modules (as defined in the Config and modules.c files) if there is no user ID or rule matching the supplied user ID. If there is a user ID and/or rule specified, the usual password and access checks will be applied, and a failure will give an Authorization Required reply.

So if a user ID appears in the database of more than one module or if a valid Require directive applies to more than one module, then the first module will verify the credentials, and no access is passed on — regardless of the AuthAuthoritative setting.

A common use for this is in conjunction with one of the basic auth modules, such as mod_auth.c. Whereas this DBM module supplies the bulk of the user-credential checking, a few (administrator) related accesses fall through to a lower level with a well-protected .htpasswd file.

Default

By default, control is not passed on, and an unknown user ID or rule will result in an Authorization Required reply. Not setting it thus keeps the system secure.

Security

Do consider the implications of allowing a user to allow fall-through in her .htaccess file, and verify that this is really what you want. Generally, it is easier to just secure a single .htpasswd file than it is to secure a database that might have more access interfaces.

require  

require [user user1 user2 ...] [group group1 group2] [valid-user] [valid-group]
directory, .htaccess
 

The key directive that throws password checking into action is require.

The argument, valid-user, accepts any users that are found in the password file. Do not mistype this as valid_user, or you will get a hard-to-explain authorization failure when you try to access this site through a browser. This is because Apache does not care what you put after require and will interpret valid_user as a username. It would be nice if Apache returned an error message, but require is usable by multiple modules, and there's no way to determine (in the current API) what values are valid.

file-owner

[Available after Apache 1.3.20] The supplied username and password must be in the AuthUserFile database, and the username must also match the system's name for the owner of the file being requested. That is, if the operating system says the requested file is owned by jones, then the username used to access it through the Web must be jones as well.

file-group

[Available after Apache 1.3.20] The supplied username and password must be in the AuthUserFile database, the name of the group that owns the file must be in the AuthGroupFile database, and the username must be a member of that group. For example, if the operating system says the requested file is owned by group accounts, the group accounts must be in the AuthGroupFile database, and the username used in the request must be a member of that group.

We could say:

require user bill ben simon

to allow only those users, provided they also have valid entries in the password table, or we could say:

require group cleaners

in which case only sonia and daphne can access the site, provided they also have valid passwords and we have set up AuthGroupFile appropriately.
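
The effect of these require lines can be sketched in Python. The function below is a hypothetical model that assumes the user has already supplied a valid password; it takes the words that follow require and reports whether the user qualifies. Treating an unrecognised keyword as a refusal is our simplification of the valid_user pitfall described earlier.

```python
GROUPS = {"cleaners": {"daphne", "sonia"}, "directors": {"bill", "ben"}}


def satisfies_require(user, requirement, groups=GROUPS):
    """Return True if an already-authenticated user meets one require line."""
    kind, *names = requirement.split()
    if kind == "valid-user":
        return True
    if kind == "user":
        return user in names
    if kind == "group":
        return any(user in groups.get(name, set()) for name in names)
    return False  # unrecognised keyword: nobody qualifies in this sketch


print(satisfies_require("bill", "user bill ben simon"))
print(satisfies_require("bill", "group cleaners"))
print(satisfies_require("daphne", "valid_user"))  # the underscore typo
```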

The block that protects ... /cgi-bin could safely be left out in the open as a separate block, but since protection of the ... /salesmen directory only arises when sales.butterthlies.com is accessed, we might as well put the require directive there.

satisfy  

satisfy [any|all]
Default: all
directory, .htaccess
 

satisfy sets access policy if both allow and require are used. The parameter can be either all or any. This directive is only useful if access to a particular area is being restricted by both username/password and client host address. In this case, the default behavior (all) is to require the client to pass the address access restriction and enter a valid username and password. With the any option, the client will be granted access if he either passes the host restriction or enters a valid username and password. This can be used to let clients from particular addresses into a password-restricted area without prompting for a password.

For instance, we want a password from everyone except site 1.2.3.4:

<usual auth setup (realm, files, etc.)>
require valid-user
Satisfy any
order deny,allow
allow from 1.2.3.4
deny from all
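
The difference between the two satisfy settings comes down to "and" versus "or". A minimal Python sketch, with hypothetical names, assuming the host check and the password check have each already produced a yes or no:

```python
def access_granted(satisfy, host_allowed, user_authenticated):
    """Combine the host-based and password-based checks as satisfy directs."""
    if satisfy not in ("all", "any"):
        raise ValueError("satisfy must be 'all' or 'any'")
    if satisfy == "any":
        return host_allowed or user_authenticated
    return host_allowed and user_authenticated


# A client at 1.2.3.4 with no password, under each policy:
print(access_granted("any", host_allowed=True, user_authenticated=False))
print(access_granted("all", host_allowed=True, user_authenticated=False))
```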

5.3 Passwords Under Unix

Authentication of salespeople is managed by the password file sales, stored in /usr/www/APACHE3/ok_users. This is safely above the document root, so that the Bad Guys cannot get at it to mess with it. The file sales is maintained using the Apache utility htpasswd. The source code for this utility is to be found in ... /apache_1.3.1/src/support/htpasswd.c, and we have to compile it with this:

% make htpasswd

htpasswd now links, and we can set it to work. Since we don't know how it functions, the obvious thing is to prod it with this:

% htpasswd -?

It responds that the correct usage is as follows:

Usage:
	htpasswd [-cmdps] passwordfile username
	htpasswd -b[cmdps] passwordfile username password

 -c  Create a new file.
 -m  Force MD5 encryption of the password.
 -d  Force CRYPT encryption of the password (default).
 -p  Do not encrypt the password (plaintext).
 -s  Force SHA encryption of the password.
 -b  Use the password from the command line rather than prompting for it.
On Windows and TPF systems the '-m' flag is used by default.
On all other systems, the '-p' flag will probably not work.

This seems perfectly reasonable behavior, so let's create a user bill with the password "theft" (in real life, you would never use so obvious a password for a character such as Bill of the notorious Butterthlies sales team, because it would be subject to a dictionary attack, but this is not real life):

% htpasswd -m -c ... /ok_users/sales bill

We are asked to type his password twice, and the job is done. If we look in the password file, there is something like the following:

bill:$1$Pd$E5BY74CgGStbs.L/fsoEU0

Add subsequent users (the -c flag creates a new file, so we shouldn't use it after the first one):

% htpasswd ... /ok_users/sales ben

There is no warning if you use the -c flag by accident, so be cautious. Carry on and do the same for sonia and daphne. We gave them all the same password, "theft," to save having to remember different ones later — another dangerous security practice.

The password file ... /ok_users/sales now looks something like this:[1]

bill:$1$Pd$E5BY74CgGStbs.L/fsoEU0
ben:$1$/S$hCyzbA05Fu4CAlFK4SxIs0
sonia:$1$KZ$ye9u..7GbCCyrK8eFGU2w.
daphne:$1$3U$CF3Bcec4HzxFWppln6Ai01

Each username is followed by an encrypted password. They are stored like this to protect the passwords because, at least in theory, you cannot work backward from the encrypted to the plain-text version. If you pretend to be Bill and log in using:

$1$Pd$E5BY74CgGStbs.L/fsoEU0

the password gets re-encrypted, becomes something like o09klks23O9RM, and fails to match. You can't tell by looking at this file (or if you can, we'll all be very disappointed) that Bill's password is actually "theft."
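
The reason a stolen hash is useless as a password can be shown with a short Python sketch. For simplicity it uses the unsalted scheme selected by htpasswd -s, which, as we understand it, stores {SHA} followed by the Base64-encoded SHA-1 digest, rather than the MD5 scheme shown above. The function names are our own.

```python
import base64
import hashlib
import hmac


def sha_entry(password):
    """Hash a password in the style of htpasswd -s."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return "{SHA}" + base64.b64encode(digest).decode("ascii")


def password_matches(stored, supplied):
    """Hash whatever the client supplied and compare with the stored value."""
    return hmac.compare_digest(stored, sha_entry(supplied))


stored = sha_entry("theft")
print(password_matches(stored, "theft"))   # the real password
print(password_matches(stored, stored))    # the stolen hash used as a password
```

The first check succeeds and the second fails, because the server hashes whatever it is given before comparing.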

From Apache v1.3.14, htpasswd will also generate a password to standard output by using the flag -n.

5.4 Passwords Under Win32

Since Win32 lacks an encryption function, passwords are stored in plain text. This is not very secure, but one hopes it will change for the better. The passwords would be stored in the file named by the AuthUserFile directive, and Bill's entry would be:

bill:theft

except that in real life you would use a better password.

5.5 Passwords over the Web

The security of these passwords on your machine becomes somewhat irrelevant when we realize that they are transmitted unencrypted over the Web. The Base64 encoding used for Basic password transmission keeps passwords from being readable at a glance, but it is very easily decoded. Authentication, as described here, should only be used for the most trivial security tasks. If a compromised password could cause any serious trouble, then it is essential to encrypt it using SSL — see Chapter 11.

5.6 From the Client's Point of View

If you run Apache using httpd1.conf, you will find you can access www.butterthlies.com as before. But if you go to sales.butterthlies.com, you will have to give a username and password.

5.6.1 The Config File

The file is httpd2.conf. These are the relevant bits:

...
AuthType Digest 
AuthName darkness
AuthDigestDomain  http://sales.butterthlies.com
AuthDigestFile /usr/www/APACHE3/ok_digest/digest_users

Run it with ./go 2. At the client end, Microsoft Internet Explorer (MSIE) v5 displayed a password screen decorated with a key and worked as you would expect; Netscape v4.05 asked for a username and password in the usual way and returned error 401 "Authorization required."

5.7 CGI Scripts

Authentication (both Basic and Digest) can also protect CGI scripts. Simply provide a suitable <Directory .../cgi-bin> block.

5.8 Variations on a Theme

You may find that logging in again is a bit more elaborate than you would think. We found that both MSIE and Netscape were annoyingly helpful in remembering the password used for the last login and using it again. To make sure you are really exercising the security features, you have to exit your browser completely each time and reload it to get a fresh crack.

You might like to try the effect of inserting these lines in either of the previous Config files:

....
#require valid-user 
#require user daphne bill 
#require group cleaners 
#require group directors
...

and uncommenting them one line at a time (remember to kill and restart Apache each time).

5.9 Order, Allow, and Deny

So far we have dealt with potential users on an individual basis. We can also allow access from or deny access to specific IP addresses, hostnames, or groups of addresses and hostnames. The commands are allow from and deny from.

The order in which the allow and deny commands are applied is not set by the order in which they appear in your file. The default order is deny then allow: if a client is excluded by deny, it is excluded unless it matches allow. If neither is matched, the client is granted access.

The order in which these commands are applied can be set by the order directive.

allow from  

allow from host host ...
directory, .htaccess
 

The allow directive controls access to a directory. The argument host can be one of the following:

all

All hosts are allowed access.

A (partial) domain name

All hosts whose names match or end in this string are allowed access.

A full or partial IP address

A single host's full IP address, or the first one to three bytes of an IP address for subnet restriction, is allowed access.

A network/netmask pair

Network a.b.c.d and netmask w.x.y.z are allowed access, to give finer-grained subnet control. For instance, 10.1.0.0/255.255.0.0.

A network CIDR specification

The netmask consists of nnn high-order 1-bits. For instance, 10.1.0.0/16 is the same as 10.1.0.0/255.255.0.0.
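
The equivalence of the two notations is easy to confirm with Python's standard ipaddress module. This sketch only illustrates the arithmetic; it does not reproduce Apache's own matching code, and the helper name is ours.

```python
import ipaddress

by_mask = ipaddress.ip_network("10.1.0.0/255.255.0.0")
by_cidr = ipaddress.ip_network("10.1.0.0/16")
print(by_mask == by_cidr)


def client_in_network(client, network):
    """Return True if the client address falls inside the given network."""
    return ipaddress.ip_address(client) in ipaddress.ip_network(network)


print(client_in_network("10.1.200.7", "10.1.0.0/16"))
print(client_in_network("10.2.0.1", "10.1.0.0/255.255.0.0"))
```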

allow from env  

allow from env=variablename ...
directory, .htaccess
 

The allow from env directive controls access by the existence of a named environment variable. For instance:

BrowserMatch ^KnockKnock/2.0 let_me_in
<Directory /docroot>
order deny,allow
deny from all
allow from env=let_me_in
</Directory>

Access by a browser called KnockKnock v2.0 sets an environment variable let_me_in, which in turn triggers allow from.

deny from  

deny from host host ...
directory, .htaccess
 

The deny from directive controls access by host. The argument host can be one of the following:

all

All hosts are denied access.

A (partial) domain name

All hosts whose names match or end in this string are denied access.

A full or partial IP address

A single host's full IP address, or the first one to three bytes of an IP address for subnet restriction, is denied access.

A network/netmask pair

Network a.b.c.d and netmask w.x.y.z are denied access, to give finer-grained subnet control. For instance, 10.1.0.0/255.255.0.0.

A network CIDR specification

The netmask consists of nnn high-order 1-bits. For instance, 10.1.0.0/16 is the same as 10.1.0.0/255.255.0.0.

deny from env  

deny from env=variablename ...
directory, .htaccess
 

The deny from env directive controls access by the existence of a named environment variable. For instance:

BrowserMatch ^BadRobot/0.9 go_away
<Directory /docroot>
order allow,deny
allow from all
deny from env=go_away
</Directory>

Access by a browser called BadRobot v0.9 sets an environment variable go_away, which in turn triggers deny from.

Order  

order ordering
directory, .htaccess
 

The ordering argument is one word (i.e., it is not allowed to contain a space) and controls the order in which the foregoing directives are applied. If two order directives apply to the same host, the last one to be evaluated prevails:

deny,allow

The deny directives are evaluated before the allow directives. This is the default.

allow,deny

The allow directives are evaluated before the denys, but the user will still be rejected if a deny is encountered.

mutual-failure

Hosts that appear on the allow list and do not appear on the deny list are allowed access.
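
The three orderings can be modelled with a small Python function. It is a sketch under the assumption that we already know whether the client matches any allow line and any deny line; the names are hypothetical.

```python
def host_allowed(order, matches_allow, matches_deny):
    """Decide access from the order setting and the two match results."""
    if order == "deny,allow":
        allowed = True           # default when nothing matches
        if matches_deny:
            allowed = False
        if matches_allow:        # allow is applied last and can override
            allowed = True
        return allowed
    if order == "allow,deny":
        allowed = False          # default when nothing matches
        if matches_allow:
            allowed = True
        if matches_deny:         # deny is applied last and can override
            allowed = False
        return allowed
    if order == "mutual-failure":
        return matches_allow and not matches_deny
    raise ValueError("unknown ordering: " + order)


# A client matching both an allow line and "deny from all":
print(host_allowed("deny,allow", matches_allow=True, matches_deny=True))
print(host_allowed("allow,deny", matches_allow=True, matches_deny=True))
```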

We could say:

allow from all

which lets everyone in and is hardly worth writing, or we could say:

allow from 123.156
deny from all

As it stands, this denies everyone except those whose IP addresses happen to start with 123.156. In other words, allow is applied last and carries the day. If, however, we changed the default order by saying:

order allow,deny
allow from 123.156
deny from all

we effectively close the site because deny is now applied last. It is also possible to use domain names, so that instead of:

deny from 123.156.3.5

you could say:

deny from badguys.com 

Although this has the advantage of keeping up with the Bad Guys as they move from one IP address to another, it also allows access by people who control the reverse-DNS mapping for their IP addresses.

The argument can contain just part of the hostname. In this case, the match is done on whole words from the right. That is, allow from fred.com allows fred.com and abc.fred.com, but not notfred.com.
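
Matching on whole words from the right amounts to comparing dot-separated labels rather than raw characters. A minimal Python sketch of that rule, with a function name of our own choosing:

```python
def host_matches(hostname, pattern):
    """True if hostname equals pattern or ends with '.' + pattern."""
    hostname = hostname.lower().rstrip(".")
    pattern = pattern.lower().lstrip(".")
    return hostname == pattern or hostname.endswith("." + pattern)


for name in ("fred.com", "abc.fred.com", "notfred.com"):
    print(name, host_matches(name, "fred.com"))
```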

Good intentions, however, are not enough: before conferring any trust in a set of access rules, you want to test them very thoroughly in private before exposing them to the world. Try the site with as many different browsers as you can muster: Netscape and MSIE can behave surprisingly differently. Having done that, try the site from a public-access terminal — in a library, for instance.

5.10 DBM Files on Unix

Although searching a file of usernames and passwords works perfectly well, it is apt to be rather slow once the list gets up to a couple hundred entries. To deal with this, Apache provides a better way of handling large lists by turning them into a database. You need one (not both!) of the modules that appear in the Config file as follows:

#Module db_auth_module  mod_auth_db.o 
Module dbm_auth_module mod_auth_dbm.o

Bear in mind that they correspond to different directives: AuthDBMUserFile or AuthDBUserFile. A Perl script to manage both types of database, dbmmanage, is supplied with Apache in .../src/support. To decide which type to use, you need to discover the capabilities of your Unix. Explore these by going to the command prompt and typing first:

% man db

and then:

% man dbm

Whichever method produces a manpage is the one you should use. You can also use a SQL database, employing MySQL or a third-party package to manage it.

Once you have decided which method to use, edit the Config file to include the appropriate module, and then type:

% ./Configure

and:

% make

We now have to create a database of our users: bill, ben, sonia, and daphne. Go to ... /apache/src/support, find the utility dbmmanage, and copy it into /usr/local/bin or something similar to put it on your path. This utility may be distributed without execute permission set, so, before attempting to run it, we may need to change the permissions:

% chmod +x dbmmanage

You may find, when you first try to run dbmmanage, that it complains rather puzzlingly that some unnamed file can't be found. Since dbmmanage is a Perl script, this is probably Perl, a text-handling language, and if you have not installed it, you should. It may also be necessary to change the first line of dbmmanage:

#!/usr/bin/perl5

to the correct path for Perl, if it is installed somewhere else.

If you provoke it with dbmmanage -?, you get:

Usage: dbmmanage [enc] dbname command [username [pw [group[,group] [comment]]]]

    where enc is  -d for crypt encryption (default except on Win32, Netware)
                  -m for MD5 encryption (default on Win32, Netware)
                  -s for SHA1 encryption
                  -p for plaintext

    command is one of: add|adduser|check|delete|import|update|view

    pw of . for update command retains the old password
    pw of - (or blank) for update command prompts for the password

    groups or comment of . (or blank) for update command retains old values
    groups or comment of - for update command clears the existing value
    groups or comment of - for add and adduser commands is the empty value

So, to add our four users to a file /usr/www/APACHE3/ok_dbm/users, we type:

% dbmmanage /usr/www/APACHE3/ok_dbm/users.db adduser bill 
New password:theft
Re-type new password:theft
User bill added with password encrypted to vJACUCNeAXaQ2 using crypt

Perform the same service for ben, sonia, and daphne. The file ... /users is not editable directly, but you can see the results by typing:

% dbmmanage /usr/www/APACHE3/ok_dbm/users view
bill:vJACUCNeAXaQ2
ben:TPsuNKAtLrLSE
sonia:M9x731z82cfDo
daphne:7DBV6Yx4.vMjc

You can build a group file with dbmmanage, but because of faults in the script that we hope will have been rectified by the time readers of this edition use it, the results seem a bit odd. To add the user fred to the group cleaners, type:

% dbmmanage /usr/www/APACHE3/ok_dbm/group add fred cleaners

(Note: do not use adduser.) dbmmanage rather puzzlingly responds with the following message:

User fred added with password encrypted to cleaners using crypt

When we test this with:

% dbmmanage /usr/www/APACHE3/ok_dbm/group view

we see:

fred:cleaners

which is correct, because in a group file the name of the group goes where the encrypted password would go in a password file.

Since we have a similar file structure, we invoke DBM authentication in ... /conf/httpd.conf by commenting out:

#AuthUserFile /usr/www/APACHE3/ok_users/sales
#AuthGroupFile /usr/www/APACHE3/ok_users/groups

and inserting:

AuthDBMUserFile /usr/www/APACHE3/ok_dbm/users 
AuthDBMGroupFile /usr/www/APACHE3/ok_dbm/users

AuthDBMGroupFile is set to the same file as the AuthDBMUserFile. What happens is that the username becomes the key in the DBM file, and the value associated with the key is password:group. To create a separate group file, a database with usernames as the key and groups as the value (with no colons in the value) would be needed.
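
The combined layout can be pictured with a Python sketch in which an ordinary dictionary stands in for the DBM file. The group assignments and the comma-separated group list follow the examples and the dbmmanage usage shown earlier; the function name is hypothetical.

```python
def split_combined_value(value):
    """Split 'password:group1,group2[:ignored]' into (password, [groups])."""
    fields = value.split(":")
    password = fields[0]
    groups = [g for g in fields[1].split(",") if g] if len(fields) > 1 else []
    return password, groups


# Keys are usernames; values hold the encrypted password and the groups.
users = {
    "bill": "vJACUCNeAXaQ2:directors",
    "sonia": "M9x731z82cfDo:cleaners",
}

password, groups = split_combined_value(users["bill"])
print(password, groups)
```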

5.10.1 AuthDBUserFile

The AuthDBUserFile directive sets the name of a DB file containing the list of users and passwords for user authentication.

AuthDBUserFile filename
directory, .htaccess

filename is the absolute path to the user file.

The user file is keyed on the username. The value for a user is the crypt( )-encrypted password, optionally followed by a colon and arbitrary data. The colon and the data following it will be ignored by the server.

5.10.1.1 Security

Make sure that the AuthDBUserFile is stored outside the document tree of the web server; do not put it in the directory that it protects. Otherwise, clients will be able to download the AuthDBUserFile.

With regard to compatibility, the implementation of dbmopen in the Apache modules reads the string length of the hashed values from the DB data structures, rather than relying upon the string being NULL-appended. Some applications, such as the Netscape web server, rely upon the string being NULL-appended, so if you are having trouble using DB files interchangeably between applications, this may be a part of the problem.

A Perl script called dbmmanage is included with Apache. This program can be used to create and update DB-format password files for use with this module.

5.10.2 AuthDBMUserFile

The AuthDBMUserFile directive sets the name of a DBM file containing the list of users and passwords for user authentication.

AuthDBMUserFile filename
directory, .htaccess

filename is the absolute path to the user file.

The user file is keyed on the username. The value for a user is the crypt( )-encrypted password, optionally followed by a colon and arbitrary data. The colon and the data following it will be ignored by the server.

5.10.2.1 Security

Make sure that the AuthDBMUserFile is stored outside the document tree of the web server; do not put it in the directory that it protects. Otherwise, clients will be able to download the AuthDBMUserFile.

With regard to compatibility, the implementation of dbmopen in the Apache modules reads the string length of the hashed values from the DBM data structures, rather than relying upon the string being NULL-appended. Some applications, such as the Netscape web server, rely upon the string being NULL-appended, so if you are having trouble using DBM files interchangeably between applications, this may be a part of the problem.

A Perl script called dbmmanage is included with Apache. This program can be used to create and update DBM-format password files for use with this module.

5.11 Digest Authentication

A halfway house between complete encryption and none at all is digest authentication. The idea is that a one-way hash, or digest, is calculated from a password and various other bits of information. Rather than sending the lightly encoded password, as is done in basic authentication, the digest is sent. At the other end, the same function is calculated: if the numbers are not identical, something is wrong — and in this case, since all other factors should be the same, the "something" must be the password.

Digest authentication is applied in Apache to improve the security of passwords. MD5 is a cryptographic hash function written by Ronald Rivest and distributed free by RSA Data Security; with its help, the client and server use the hash of the password and other stuff. The point of this is that although many passwords lead to the same hash value, there is a very small chance that a wrong password will give the right hash value, if the hash function is intelligently chosen; it is also very difficult to construct a password leading to the same hash value (which is why these are sometimes referred to as one-way hashes). The advantage of using the hash value is that the password itself is not sent to the server, so it isn't visible to the Bad Guys. Just to make things more tiresome for them, MD5 adds a few other things into the mix: the URI, the method, and a nonce. A nonce is simply a number chosen by the server and told to the client, usually different each time. It ensures that the digest is different each time and protects against replay attacks.[2] The digest function looks like this:

MD5(MD5(<user>+":"+<realm>+":"+<password>)+":"+<nonce>+":"+MD5(<method>+":"+<uri>))

MD5 digest authentication can be invoked with the following line:

AuthType Digest

This plugs a nasty hole in the Internet's security. As we saw earlier — and almost unbelievably — the authentication procedures discussed up to now send the user's password in barely encoded text across the Web. A Bad Guy who intercepts the Internet traffic then knows the user's password. This is a Bad Thing.

You can use either SSL (see Chapter 11) to encrypt the password, or digest authentication. Digest authentication works this way:

  1. The client requests a URL.

  2. Because that URL is protected, the server replies with error 401, "Authentication required," and among the headers, it sends a nonce.

  3. The client combines the user's password, the nonce, the method, and the URL, as described previously, then sends the result back to the server. The server does the same thing with the hash of the user's password retrieved from the password file and checks that its result matches.[3]

A different nonce is sent the next time, so that the Bad Guy can't use the captured digest to gain access.
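
The exchange above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Apache's code: the function names and the sample nonces are our own, and we assume the basic form of the scheme, in which the inner hash covers the username, realm, and password joined by colons (the value that htdigest stores in the password file).

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def stored_hash(user, realm, password):
    # What the server keeps in its password file instead of the password.
    return md5_hex(f"{user}:{realm}:{password}")


def digest_response(password_hash, nonce, method, uri):
    # Step 3: combine the password hash, the nonce, the method, and the URI.
    request_hash = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{password_hash}:{nonce}:{request_hash}")


# Written to the password file when the user was created.
file_hash = stored_hash("bill", "darkness", "theft")

# Client side: starts from the password the user typed.
typed_hash = stored_hash("bill", "darkness", "theft")
client = digest_response(typed_hash, "first-nonce", "GET", "/")

# Server side: starts from the hash in its file and compares.
print(client == digest_response(file_hash, "first-nonce", "GET", "/"))

# A captured response is no use once the server has moved to a new nonce.
print(client == digest_response(file_hash, "second-nonce", "GET", "/"))
```

Because the nonce is folded into the final hash, the first comparison succeeds and the second does not.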

MD5 digest authentication is implemented in Apache, using mod_auth_digest, for two reasons. First, it provides one of the two fully compliant reference HTTP 1.1 implementations required for the standard to advance down the standards track; second, it provides a test bed for browser implementations. It should only be used for experimental purposes, particularly since it makes no effort to check that the returned nonce is the same as the one it chose in the first place.[4] This makes it susceptible to a replay attack.
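
One way to act on the suggestion in footnote [4] is to build the time into the nonce and sign it, so that the server can check freshness without remembering anything between requests. The sketch below is an illustration under our own assumptions, not what mod_auth_digest does: the secret, the five-minute threshold, the HMAC-SHA-256 signature, and both function names are hypothetical.

```python
import hashlib
import hmac
import time

SERVER_SECRET = b"change-me"  # hypothetical secret known only to the server
MAX_AGE_SECONDS = 300         # hypothetical freshness threshold


def _sign(timestamp):
    # Keyed hash of the timestamp, so clients cannot forge their own nonces.
    return hmac.new(SERVER_SECRET, timestamp.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def make_nonce(now=None):
    """Return a nonce of the form '<unix time>:<signature>'."""
    timestamp = str(int(time.time() if now is None else now))
    return f"{timestamp}:{_sign(timestamp)}"


def nonce_is_fresh(nonce, now=None):
    """Accept only nonces this server issued within the last MAX_AGE_SECONDS."""
    current = time.time() if now is None else now
    timestamp, separator, signature = nonce.partition(":")
    if not separator:
        return False
    expected = _sign(timestamp)
    if not hmac.compare_digest(signature.encode("utf-8"),
                               expected.encode("utf-8")):
        return False
    # The signature matched, so the timestamp is one we generated.
    return 0 <= current - int(timestamp) <= MAX_AGE_SECONDS
```

A server using something like this would call make_nonce when sending the 401 and nonce_is_fresh on the nonce returned in the client's header, rejecting the request before comparing digests if the nonce is stale or forged.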

The httpd.conf file is as follows:

User webuser
Group webgroup
ServerName www.butterthlies.com
ServerAdmin sales@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.digest/htdocs/customers
ErrorLog /usr/www/APACHE3/site.digest/logs/customers/error_log
TransferLog /usr/www/APACHE3/site.digest/logs/customers/access_log
ScriptAlias /cgi-bin /usr/www/APACHE3/cgi-bin

<VirtualHost sales.butterthlies.com>
ServerAdmin sales_mgr@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.digest/htdocs/salesmen
ServerName sales.butterthlies.com
ErrorLog /usr/www/APACHE3/site.digest/logs/salesmen/error_log
TransferLog /usr/www/APACHE3/site.digest/logs/salesmen/access_log
ScriptAlias /cgi-bin /usr/www/APACHE3/cgi-bin

<Directory /usr/www/APACHE3/site.digest/htdocs/salesmen>
AuthType Digest
AuthName darkness
AuthDigestFile /usr/www/APACHE3/ok_digest/sales
require valid-user
#require group cleaners
</Directory>
</VirtualHost>

Go to the Config file (see Chapter 1). If the line:

Module digest_module mod_digest.o

is commented out, uncomment it and remake Apache as described previously. Go to the Apache support directory, and type:

% make htdigest
% cp htdigest /usr/local/bin

The command-line syntax for htdigest is:

% htdigest [-c] passwordfile realm user

Go to /usr/www/APACHE3 (or some other appropriate spot) and make the ok_digest directory and contents:

% mkdir ok_digest
% cd ok_digest
% htdigest -c sales darkness bill
Adding password for user bill in realm darkness.
New password: theft
Re-type new password: theft
% htdigest sales darkness ben
...
% htdigest sales darkness sonia
...
% htdigest sales darkness daphne
...
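
For the curious, each line that htdigest writes has the form user:realm:hash, where the hash is the MD5 of the username, realm, and password joined by colons. A minimal Python sketch that reproduces such lines follows; the function name is ours, and htdigest itself remains the tool to use for real files.

```python
import hashlib


def htdigest_line(user, realm, password):
    """Build one password-file line in the user:realm:hash layout."""
    joined = f"{user}:{realm}:{password}"
    return f"{user}:{realm}:{hashlib.md5(joined.encode('utf-8')).hexdigest()}"


# The four users created above, all in realm "darkness" with password "theft".
for name in ("bill", "ben", "sonia", "daphne"):
    print(htdigest_line(name, "darkness", "theft"))
```

Note that the realm is part of the hash, which is why the realm given to htdigest must match the AuthName in the Config file.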

Digest authentication can, in principle, also use group authentication. In earlier editions we had to report that none of it seemed to work with the then available versions of MSIE or Netscape. However, Netscape v6.2.3 and MSIE 6.0.26 seemed happy enough, though we have not tested them thoroughly. Include the line:

LogLevel debug

in the Config file, and check the error log for entries such as the following:

client used wrong authentication scheme: Basic for \

Whether a webmaster uses this facility might depend on whether he can control which browsers the clients use.

5.11.1 ContentDigest

This directive enables the generation of Content-MD5 headers as defined in RFC 1864 and RFC 2068.

ContentDigest on|off
Default: ContentDigest off
server config, virtual host, directory, .htaccess

MD5, as described earlier in this chapter, is an algorithm for computing a "message digest" (sometimes called "fingerprint") of arbitrary-length data, with a high degree of confidence that any alterations in the data will be reflected in alterations in the message digest. The Content-MD5 header provides an end-to-end message integrity check (MIC) of the entity body. A proxy or client may check this header for detecting accidental modification of the entity body in transit. See the following example header:

   Content-MD5: AuLb7Dp1rqtRtxz2m9kRpA==

Note that this can cause performance problems on your server since the message digest is computed on every request (the values are not cached).

Content-MD5 is only sent for documents served by the core and not by any module. For example, SSI documents, output from CGI scripts, and byte-range responses do not have this header.
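
The header value is simply the 16-byte MD5 digest of the entity body, base64 encoded. A client or proxy that wants to check it could do something like the following Python sketch; the function names are ours, and the bodies are made-up examples.

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 header value for a body given as bytes."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_is_intact(body, header_value):
    """Recompute the digest and compare it with the header that arrived."""
    return content_md5(body) == header_value


page = b"<html><body>Butterthlies catalog</body></html>"
header = content_md5(page)

print(body_is_intact(page, header))
print(body_is_intact(page.replace(b"catalog", b"catalogue"), header))
```

This catches accidental damage only: anyone able to alter the body deliberately can recompute the header as well.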

5.12 Anonymous Access

It sometimes happens that even though you have passwords controlling the access to certain things on your site, you also want to allow guests to come and sample the site's joys — probably a reduced set of joys, mediated by the username passed on by the client's browser. The Apache module mod_auth_anon.c allows you to do this.

We have to say that the whole enterprise seems rather silly. If you want security at all on any part of your site, you need to use SSL. If you then want to make some of the material accessible to everyone, you can give them a different URL or a link from a reception page. However, it seems that some people want to do this to capture visitors' email addresses (using a long-standing convention for anonymous access), and if that is what you want, and if your users' browsers are configured to provide that information, then here's how.

The module should be compiled in automatically — check by looking at Configuration or by running httpd -l. If it wasn't compiled in, you will probably get this unnerving error message:

Invalid command Anonymous

when you try to exercise the Anonymous directive. The Config file in ... /site.anon/conf/httpd.conf is as follows:

User webuser
Group webgroup
ServerName www.butterthlies.com

IdentityCheck on
NameVirtualHost 192.168.123.2

<VirtualHost www.butterthlies.com>
ServerAdmin sales@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.anon/htdocs/customers
ServerName www.butterthlies.com
ErrorLog /usr/www/APACHE3/site.anon/logs/customers/error_log
TransferLog /usr/www/APACHE3/site.anon/logs/access_log
ScriptAlias /cgi-bin /usr/www/APACHE3/cgi-bin
</VirtualHost>

<VirtualHost sales.butterthlies.com>
ServerAdmin sales_mgr@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.anon/htdocs/salesmen
ServerName sales.butterthlies.com
ErrorLog /usr/www/APACHE3/site.anon/logs/error_log
TransferLog /usr/www/APACHE3/site.anon/logs/salesmen/access_log
ScriptAlias /cgi-bin /usr/www/APACHE3/cgi-bin

<Directory /usr/www/APACHE3/site.anon/htdocs/salesmen>
AuthType Basic
AuthName darkness

AuthUserFile /usr/www/APACHE3/ok_users/sales
AuthGroupFile /usr/www/APACHE3/ok_users/groups

require valid-user
Anonymous guest anonymous air-head
Anonymous_NoUserID on
</Directory>

</VirtualHost>

Run go and try accessing http://sales.butterthlies.com/. You should be asked for a password in the usual way. The difference is that now you can also get in by being guest, air-head, or anonymous. You may have to type something in the password field. The Anonymous directives follow.

Anonymous  

Anonymous userid1 userid2 ...
 

The user can log in as any user ID on the list, but must provide something in the password field unless that is switched off by another directive.

Anonymous_NoUserID  

Anonymous_NoUserID [on|off]
Default: off
directory, .htaccess
 

If on, users can leave the ID field blank but must put something in the password field.

Anonymous_LogEmail  

Anonymous_LogEmail [on|off]
Default: on
directory, .htaccess
 

If on, accesses are logged to ... /logs/httpd_log or to the log set by TransferLog.

Anonymous_VerifyEmail  

Anonymous_VerifyEmail [on|off]
Default: off
directory, .htaccess
 

The user ID must contain at least one "@" and one ".".

Anonymous_Authoritative  

Anonymous_Authoritative [on|off]
Default: off
directory, .htaccess
 

If this directive is on and the client fails anonymous authorization, she fails all authorization. If it is off, other authorization schemes will get a crack at her.

Anonymous_MustGiveEmail  

Anonymous_MustGiveEmail [on|off]
Default: on
directory, .htaccess
 

The user must give an email ID as a password.

5.13 Experiments

Run ./go. Exit from your browser on the client machine, and reload it to make sure it does password checking properly (you will probably need to do this every time you make a change throughout this exercise). If you access the salespeople's site again with the user ID guest, anonymous, or air-head and any password you like (fff or 23 or rubbish), you will get access. It seems rather silly, but you must give a password of some sort.

Set:

Anonymous_NoUserID on

This time you can leave both the ID and password fields empty. If you enter a valid username (bill, ben, sonia, or daphne), you must follow through with a valid password.

Set:

Anonymous_NoUserID off
Anonymous_VerifyEmail on
Anonymous_LogEmail on

The effect here is that the user ID has to look something like an email address, with (according to the documentation) at least one "@" and one ".". However, we found that one "." or one "@" would do. Email is logged in the error log, not the access log as you might expect.

Set:

Anonymous_VerifyEmail off
Anonymous_LogEmail off
Anonymous_Authoritative on

The effect here is that if an access attempt fails, it is not now passed on to the other methods. Up to now we have always been able to enter as bill, password theft, but no more. Change the Anonymous section to look like this:

Anonymous_Authoritative off
Anonymous_MustGiveEmail on

Finally:

Anonymous guest anonymous air-head
Anonymous_NoUserID off
Anonymous_VerifyEmail off
Anonymous_Authoritative off
Anonymous_LogEmail on
Anonymous_MustGiveEmail on

The documentation says that Anonymous_MustGiveEmail forces the user to give some sort of password. In fact, it seems to have the same effect as Anonymous_VerifyEmail: a "." or "@" will do.
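
The gap between the documented check and what we observed amounts to the difference between "and" and "or". The two Python predicates below are our own sketch of that difference, not code taken from mod_auth_anon; the function names and sample values are hypothetical.

```python
def documented_check(value):
    """What the documentation describes: at least one "@" and one "."."""
    return "@" in value and "." in value


def observed_check(value):
    """What we saw in practice: either character on its own is enough."""
    return "@" in value or "." in value


for sample in ("guest@example.com", "guest@", "a.b", "rubbish"):
    print(sample, documented_check(sample), observed_check(sample))
```

Neither version verifies that the address exists, so treat whatever is logged as a courtesy from the visitor rather than as identification.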

5.13.1 Access.conf

In the first edition of this book we said that if you wrote your httpd.conf file as shown earlier, but also created .../conf/access.conf containing directives as innocuous as:

<Directory /usr/www/APACHE3/site.anon/htdocs/salesmen>
</Directory>

security in the salespeople's site would disappear. This bug seems to have been fixed in Apache v1.3.

5.14 Automatic User Information

This is all great fun, but we are trying to run a business here. Our salespeople are logging in because they want to place orders, and we ought to be able to detect who they are so we can send the goods to them automatically. This can be done by looking at the environment variable REMOTE_USER, which will be set to the current username. Just for the sake of completeness, we should note another directive here.

5.14.1 IdentityCheck

The IdentityCheck directive causes the server to attempt to identify the client's user by querying the identd daemon of the client host. (See RFC 1413 for details, but the short explanation is that identd will, when given a socket number, reveal which user created that socket — that is, the username of the client on his home machine.)

IdentityCheck [on|off]

If successful, the user ID is logged in the access log. However, as the Apache manual austerely remarks, you should "not trust this information in any way except for rudimentary usage tracking." Furthermore (or perhaps, furtherless), this extra logging slows Apache down, and many machines do not run an identd daemon, or if they do, they prevent external access to it. Even if the client's machine is running identd, the information it provides is entirely under the control of the remote machine. Many providers find that it is not worth the trouble to use IdentityCheck.
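
REMOTE_USER, by contrast, is set by the server itself once authentication succeeds, so a CGI script can rely on it. As a minimal sketch, assuming a Python CGI script living behind one of the protected directories shown earlier (the function name and the output text are ours):

```python
import os


def current_user(environ=os.environ):
    """Return the authenticated username, or None if there is none."""
    return environ.get("REMOTE_USER") or None


if __name__ == "__main__":
    user = current_user()
    # A CGI response: headers, a blank line, then the body.
    print("Content-Type: text/plain")
    print()
    if user:
        print(f"Order will be sent to: {user}")
    else:
        print("No authenticated user")
```

Passing the environment as a parameter makes the function easy to try out with an ordinary dictionary, away from the server.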

5.15 Using .htaccess Files

We experimented with putting configuration directives in a file called ... /htdocs/.htaccess rather than in httpd.conf. It worked, but how do you decide whether to do things this way rather than the other?

The point of the .htaccess mechanism is that you can change configuration directives without having to restart the server. This is especially valuable on a site where a lot of people maintain their own home pages but are not authorized to bring the server down or, indeed, to modify its Config files. The drawback to the .htaccess method is that the files are parsed for each access to the server, rather than just once at startup, so there is a substantial performance penalty.

The httpd1.conf (from ... /site.htaccess) file contains the following:

User webuser
Group webgroup
ServerName www.butterthlies.com
AccessFileName .myaccess

ServerAdmin sales@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.htaccess/htdocs/salesmen
ErrorLog /usr/www/APACHE3/site.htaccess/logs/error_log
TransferLog /usr/www/APACHE3/site.htaccess/logs/access_log

ServerName sales.butterthlies.com

Access control, as specified by AccessFileName, is now in ... /htdocs/salesmen/.myaccess:

AuthType Basic
AuthName darkness
AuthUserFile /usr/www/APACHE3/ok_users/sales
AuthGroupFile /usr/www/APACHE3/ok_users/groups
require group cleaners

If you run the site with ./go 1 and access http://sales.butterthlies.com/, you are asked for an ID and a password in the usual way. You had better be daphne or sonia if you want to get in, because only members of the group cleaners are allowed.

You can then edit ... /htdocs/salesmen/.myaccess to require group directors instead. Without reloading Apache, you now have to be bill or ben.

5.15.1 AccessFileName

AccessFileName gives authority to the files specified. If a directory is given, authority is given to all files in it and its subdirectories.

AccessFileName filename, filename|directory and subdirectories ...
Server config, virtual host

Include the following line in httpd.conf:

AccessFileName .myaccess1, myaccess2 ...

Restart Apache (since the AccessFileName has to be read at startup). You might expect that you could limit AccessFileName to .myaccess in some particular directory, but not elsewhere. You can't — it is global (well, more global than per-directory). Try editing ... /conf/httpd.conf to read:

<Directory /usr/www/APACHE3/site.htaccess/htdocs/salesmen>
AccessFileName .myaccess
</Directory>

Apache complains:

Syntax error on line 2 of /usr/www/APACHE3/conf/srm.conf: AccessFileName not allowed here

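As we have said, this file is found and parsed on each access, and this takes time. When a client requests access to a file /usr/www/APACHE3/site.htaccess/htdocs/salesmen/index.html, Apache searches for the following:

/.myaccess
/usr/.myaccess
/usr/www/.myaccess
/usr/www/APACHE3/.myaccess
/usr/www/APACHE3/site.htaccess/.myaccess
/usr/www/APACHE3/site.htaccess/htdocs/.myaccess
/usr/www/APACHE3/site.htaccess/htdocs/salesmen/.myaccess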

This multiple search also slows business down. You can turn multiple searching off, making a noticeable difference to Apache's speed, with the following directive:

<Directory />
AllowOverride none
</Directory>

It is important to understand that / means the real, root directory (because that is where Apache starts searching) and not the server's document root.

5.16 Overrides

We can do more with overrides than speed up Apache. This mechanism allows the webmaster to exert finer control over what is done in .htaccess files. The key directive is AllowOverride.

5.16.1 AllowOverride

This directive tells Apache which directives in an .htaccess file can override earlier directives.

AllowOverride override1 override2 ...
Directory 

The list of AllowOverride overrides is as follows:

AuthConfig

Allows individual settings of AuthDBMGroupFile, AuthDBMUserFile, AuthGroupFile, AuthName, AuthType, AuthUserFile, and require

FileInfo

Allows AddType, AddEncoding, AddLanguage, AddCharset, AddHandler, RemoveHandler, LanguagePriority, ErrorDocument, DefaultType, Action, Redirect, RedirectMatch, RedirectTemp, RedirectPermanent, PassEnv, SetEnv, UnsetEnv, Header, RewriteEngine, RewriteOptions, RewriteBase, RewriteCond, RewriteRule, CookieTracking, and CookieName

Indexes

Allows FancyIndexing, AddIcon, AddDescription (see Chapter 7)

Limit

Can limit access based on hostname or IP number

Options

Allows the use of the Options directive (see Chapter 13)

All

All of the previous

None

None of the previous

You might ask: if none switches multiple searches off, which of these options switches it on? The answer is any of them, or the complete absence of AllowOverride. In other words, it is on by default.

To illustrate how this works, look at .../site.htaccess/httpd3.conf, which is httpd2.conf with the authentication directives on the salespeople's directory back in again. The Config file wants cleaners; the .myaccess file wants directors. If we now put the authorization directives, favoring cleaners, back into the Config file:

User webuser
Group webgroup
ServerName www.butterthlies.com
AccessFileName .myaccess

ServerAdmin sales@butterthlies.com
DocumentRoot /usr/www/APACHE3/site.htaccess/htdocs/salesmen
ErrorLog /usr/www/APACHE3/site.htaccess/logs/error_log
TransferLog /usr/www/APACHE3/site.htaccess/logs/access_log

ServerName sales.butterthlies.com

#AllowOverride None
AuthType Basic
AuthName darkness
AuthUserFile /usr/www/APACHE3/ok_users/sales
AuthGroupFile /usr/www/APACHE3/ok_users/groups
require group cleaners

and restart Apache, we find that we have to be a director (bill or ben). But, if we edit the Config file and uncomment the line:

...
AllowOverride None
...

we find that we have turned off the .htaccess method and that cleaners are back in fashion. In real life, the webmaster might impose a general policy of access control with this:

...
AllowOverride AuthConfig
...
require valid-user
...

The owners of the various pages could then limit their visitors further with this:

require group directors

See .../site.htaccess/httpd4.conf. As can be seen, AllowOverride makes it possible for individual directories to be precisely tailored.

[1]  Note that this version of the file is produced by FreeBSD, so it doesn't use the old-style DES version of the crypt( ) function — instead, it uses one based on MD5, so the password strings may look a little peculiar to you. Different operating environments may produce different results, but each should work in its own environment.

[2]  This is a method in which the Bad Guy simply monitors the Good Guy's session and reuses the headers for her own access. If there were no nonce, this would work every time!

[3]   Which is why MD5 is applied to the password, as well as to the whole thing: the server then doesn't have to store the actual password, just a digest of it.

[4]  It is unfortunate that the nonce must be returned as part of the client's digest authentication header, but since HTTP is a stateless protocol, there is little alternative. It is even more unfortunate that Apache simply believes it! An obvious way to protect against this is to include the time somewhere in the nonce and to refuse nonces older than some threshold.
