Create a function that takes a string to validate and returns true if the string passes a check and false if it doesn't. Inside the function, use regular expressions and comparisons to check the data. For example, Example 9-1 shows the pc_validate_zipcode( ) function, which validates a U.S. Zip Code.
function pc_validate_zipcode($zipcode) {
    // preg_match() returns 1 or 0; cast so the function returns true or false
    return (bool) preg_match('/^[0-9]{5}([- ]?[0-9]{4})?$/', $zipcode);
}
Here's how to use it:
if (pc_validate_zipcode($_REQUEST['zipcode'])) {
    // U.S. Zip Code is okay, can proceed
    process_data();
} else {
    // this is not an okay Zip Code, print an error message
    print "Your ZIP Code should be 5 digits (or 9 digits, if you're ";
    print "using ZIP+4).";
    print_form();
}
Deciding what constitutes valid and invalid data is almost more of a philosophical task than a straightforward matter of following a series of fixed steps. In many cases, what may be perfectly fine in one situation won't be correct in another.
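The easiest check is making sure the field isn't blank. The empty( ) function handles most cases, but it also treats the string "0" as empty, so compare the trimmed value against the empty string when "0" is a legitimate entry.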
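A minimal sketch of a blank check that accepts "0" might look like the following. The helper name pc_field_is_blank( ) is hypothetical and does not come from the recipe; it assumes form values arrive as strings and treats anything else (a missing field, an array) as blank.

```php
<?php
// Hypothetical helper (not part of the original recipe): a field counts as
// blank when it is not a string or contains nothing but whitespace.
function pc_field_is_blank($value) {
    return !is_string($value) || trim($value) === '';
}

$zero = '0';
var_dump(empty($zero));             // bool(true): empty() treats "0" as blank
var_dump(pc_field_is_blank($zero)); // bool(false)
var_dump(pc_field_is_blank('   ')); // bool(true)
var_dump(pc_field_is_blank(null));  // bool(true)
```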
Next come relatively easy checks, such as the case of a U.S. Zip Code. Usually, a regular expression or two can solve these problems. For example:
/^[0-9]{5}([- ]?[0-9]{4})?$/
matches any string in the five-digit or ZIP+4 format. It checks the format only, not whether the Zip Code actually exists.
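To see what this pattern accepts and rejects, it can help to run it against a handful of inputs. The sample values below are made up for illustration. The last two lines show one caveat: in PHP's PCRE functions, $ also matches just before a trailing newline, and appending the D modifier is one way to rule that out.

```php
<?php
$pattern = '/^[0-9]{5}([- ]?[0-9]{4})?$/';

// Illustrative inputs only; the first four match, the last three do not.
$samples = array('02134', '02134-1234', '02134 1234', '021341234',
                 '2134', '02134-12', 'O2134');

foreach ($samples as $sample) {
    $result = preg_match($pattern, $sample) ? 'matches' : 'no match';
    printf("%-12s %s\n", $sample, $result);
}

// Caveat: without the D modifier, a trailing newline slips through.
var_dump(preg_match($pattern, "02134\n"));       // int(1)
var_dump(preg_match($pattern . 'D', "02134\n")); // int(0)
```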
Sometimes, however, coming up with the correct regular expression is difficult. If you want to verify that someone has entered only two names, such as "Alfred Aho," you can check against:
/^[A-Za-z]+ +[A-Za-z]+$/
However, Tim O'Reilly can't pass this test. An alternative is /^\S+\s+\S+$/; but then Donald E. Knuth is rejected. So think carefully about the entire range of valid input before writing your regular expression.
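Running both patterns against the three names makes the trade-off concrete. This is only a sketch; the variable names are chosen here for illustration.

```php
<?php
$letters_only = '/^[A-Za-z]+ +[A-Za-z]+$/'; // two runs of ASCII letters
$two_tokens   = '/^\S+\s+\S+$/';            // any two whitespace-separated tokens

$names = array('Alfred Aho', "Tim O'Reilly", 'Donald E. Knuth');

foreach ($names as $name) {
    printf("%-16s letters-only: %d  two-tokens: %d\n",
           $name,
           preg_match($letters_only, $name),
           preg_match($two_tokens, $name));
}
// Alfred Aho passes both; Tim O'Reilly passes only the second pattern;
// Donald E. Knuth passes neither.
```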
In some instances, even with regular expressions, it becomes difficult to check if the field is legal. One particularly popular and tricky task is validating an email address, as discussed in Recipe 13.7. Another is how to make sure a user has correctly entered the name of her U.S. state. You can check against a listing of names, but what if she enters her postal service abbreviation? Will MA instead of Massachusetts work? What about Mass.?
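One possible approach is to normalize whatever the user types to a single canonical form before checking it. The sketch below assumes a lookup table keyed by postal abbreviation; the table is deliberately partial, and pc_normalize_state( ) is a hypothetical helper, not a function from the recipe. Informal forms such as "Mass." are still rejected unless you add them as aliases yourself.

```php
<?php
// Partial table for illustration only; a real form would list every state.
$pc_states = array(
    'CA' => 'California',
    'MA' => 'Massachusetts',
    'NY' => 'New York',
);

// Hypothetical helper: returns the two-letter abbreviation for a recognized
// abbreviation or full name (case-insensitive), or false otherwise.
function pc_normalize_state($input, $states) {
    $clean = strtoupper(trim($input));
    if (isset($states[$clean])) {
        return $clean;
    }
    foreach ($states as $abbrev => $name) {
        if (strtoupper($name) === $clean) {
            return $abbrev;
        }
    }
    return false;
}

var_dump(pc_normalize_state('ma', $pc_states));            // string(2) "MA"
var_dump(pc_normalize_state('Massachusetts', $pc_states)); // string(2) "MA"
var_dump(pc_normalize_state('Mass.', $pc_states));         // bool(false)
```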
One way to avoid this issue is to present the user with a dropdown list of pregenerated choices. Using a select element, users are forced by the form's design to select a state in the format that always works, which can reduce errors. This, however, presents another series of difficulties. What if the user lives some place that isn't one of the choices? What if the range of choices is so large this isn't a feasible solution?
There are a number of ways to solve these types of problems. First, you can provide an "other" option in the list, so that a non-U.S. user can successfully complete the form. (Otherwise, she'll probably just pick a place at random, so she can continue using your site.) Next, you can divide the registration process into a two-part sequence. For a long list of options, a user begins by picking the letter of the alphabet his choice begins with; then, a new page provides him with a list containing only the choices beginning with that letter.
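The dropdown approach, including an "other" entry, can be sketched as follows. Both function names and the shortened list of choices are assumptions made for this example. Because a forged request can submit any value regardless of what the form offers, the sketch also checks the submitted value against the same array that built the list.

```php
<?php
// Shortened list for illustration; the "other" entry covers non-U.S. users.
$pc_choices = array(
    'MA'    => 'Massachusetts',
    'NY'    => 'New York',
    'other' => 'Other / outside the U.S.',
);

// Hypothetical helper: builds a select element from value => label pairs.
function pc_build_select($name, $choices) {
    $html = '<select name="' . htmlspecialchars($name) . '">' . "\n";
    foreach ($choices as $value => $label) {
        $html .= '  <option value="' . htmlspecialchars($value) . '">'
               . htmlspecialchars($label) . "</option>\n";
    }
    return $html . "</select>\n";
}

// Hypothetical helper: accepts only values that were offered in the list.
function pc_validate_choice($submitted, $choices) {
    return is_string($submitted) && array_key_exists($submitted, $choices);
}

print pc_build_select('state', $pc_choices);
var_dump(pc_validate_choice('other', $pc_choices)); // bool(true)
var_dump(pc_validate_choice('ZZ', $pc_choices));    // bool(false)
```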
Finally, there are even trickier problems. What do you do when you want to make sure the user has correctly entered information, but you don't want to tell her you did so? A situation where this is important is a sweepstakes; in a sweepstakes, there's often a special code box on the entry form in which a user enters a string — AD78DQ — from an email or flier she's received. You want to make sure there are no typos, or your program won't count her as a valid entrant. You also don't want to allow her to just guess codes, because then she could try out those codes and crack the system.
The solution is to have two input boxes. A user enters her code twice; if the two fields match, you accept the data as legal and then (silently) validate the data. If the fields don't match, you reject the entry and have the user fix it. This procedure eliminates typos and doesn't reveal how the code validation algorithm works; it can also prevent misspelled email addresses.
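A rough sketch of that flow appears below. All three functions are hypothetical, and pc_code_is_known( ) merely stands in for whatever private check a real sweepstakes would use. The point is that the user sees the same acknowledgment whether or not the code turns out to be valid.

```php
<?php
// Hypothetical helper: true when both fields hold the same non-blank string.
function pc_codes_match($first, $second) {
    if (!is_string($first) || !is_string($second)) {
        return false;
    }
    $first = trim($first);
    return $first !== '' && $first === trim($second);
}

// Stand-in for the private validation step; a real list would live elsewhere.
function pc_code_is_known($code) {
    return in_array($code, array('AD78DQ'), true);
}

// Returns the message to show the user plus a flag kept on the server only.
function pc_handle_entry($first, $second) {
    if (!pc_codes_match($first, $second)) {
        return array('message' => 'The two codes do not match; please re-enter them.',
                     'valid'   => null);
    }
    // Silent validation: the result is recorded but never revealed.
    return array('message' => 'Thanks, your entry has been received.',
                 'valid'   => pc_code_is_known(trim($first)));
}

$good = pc_handle_entry('AD78DQ', 'AD78DQ');
$bad  = pc_handle_entry('XXXXXX', 'XXXXXX');
var_dump($good['message'] === $bad['message']); // bool(true)
```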
Keep in mind that PHP performs validation on the server. Server-side validation requires that a request be made to the server and a page returned in response; as a result, it can be slow. It's also possible to do client-side validation using JavaScript. While client-side validation is faster, it exposes your code to the user, may not work if the client doesn't support JavaScript or has disabled it, and can be bypassed entirely by anyone who submits a request without using your form. Therefore, treat client-side checks as a convenience only, and always repeat every one of them on the server.
See Recipe 13.7 for a regular expression for validating email addresses, and Chapter 7, "Validation on the Server and Client," of Web Database Applications with PHP and MySQL (Hugh Williams and David Lane, O'Reilly).
Copyright © 2003 O'Reilly & Associates. All rights reserved.