18.1 Why Make Backups?
Backups are important only if you value the work that you do on your computer. If you use your computer as a paperweight, then you don't need to make backups.
Years ago, making daily backups was a common practice because computer hardware would often fail for no obvious reason. A backup was the only protection against data loss.
Today, hardware failure is still a good reason to back up your system. Hard disk failures are a random process: even though a typical hard disk will now last for five years or more, an organization that has 20 or 30 hard disks can expect a significant drive failure every few months. Drives frequently fail without warning, sometimes only a few days after they have been put into service. It's prudent, therefore, to back up your system on a regular basis.
Backups can also be an important tool for securing computers against attacks. Specifically, a full backup allows you to see what an intruder has changed by comparing the files on the computer with the files on the backup. We recommend that you make your first backup of your computer after you install its operating system, load your applications, and install all of the necessary security patches. Not only will this first backup allow you to analyze your system after an attack to see what has been modified, but it will also save the time of rebuilding your system from scratch in the event of a hardware failure.
18.1.1 The Role of Backups
Backups serve many different purposes in a typical organization:
With all of these different uses for backups, it's not surprising that there are so many different forms of backups in use today. Here are just a few:
18.1.2 What Should You Back Up?
There are two approaches to computer backup systems:
We recommend the second approach. While some of the information you back up is already "backed up" on the original distribution disks or tapes you used to load the system onto your hard disk, distribution disks or tapes sometimes get lost. Furthermore, as your system ages, programs get installed in the operating system's reserved directories as security holes are discovered and patched, and as other changes occur. If you've ever tried to restore your system after a disaster,[2] you know how much easier the process is when everything is in the same place.
For this reason, we recommend that you store everything from your system (and that means everything necessary to reinstall the system from scratch, every last file) onto backup media at regular, predefined intervals. How often you do this depends on the speed of your backup equipment and the amount of storage space allocated for backups. You might want to do a total backup once a week, or you might want to do it only twice a year. But please do it!
18.1.3 Types of Backups
There are three basic types of backups:
Full backups and incremental backups work together. A common backup strategy is:
Most administrators of large systems plan and store their backups by disk drive or partition. Different partitions usually require different backup strategies. Some partitions, such as your system partitions (if they are separate), should probably be backed up whenever you make a change to them, on the theory that every change that you make to them is too important to lose. You should use full backups with these systems, rather than incremental backups, because they are usable only in their entirety. Likewise, partitions that are used solely for storing application programs really need to be backed up only when new programs are installed or when the configuration of existing programs is changed. On the other hand, partitions that are used for keeping user files are more amenable to incremental backups. But you may wish to make such backups frequently to minimize the amount of work that would be lost in the event of a failure. When you make incremental backups, use a rotating set of backup disks or tapes.[5] The backup you do tonight shouldn't write over the tape you used for your backup last night. Otherwise, if your computer crashes in the middle of tonight's backup, you would lose the data on the disk, the data in tonight's backup (because it is incomplete), and the data in last night's backup (because you partially overwrote it with tonight's backup). Ideally, perform an incremental backup once a night, and have a different tape for every night of the week, as shown in Figure 18-1. The freeware Amanda backup system and most commercial backup systems automate this practice.
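The nightly rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tape labels and the function names `tape_for` and `safe_to_overwrite` are hypothetical, and a real backup system such as Amanda keeps its own tape catalog rather than relying on the day of the week.

```python
from datetime import date

# Hypothetical tape labels, one per night of the week (Monday first).
TAPE_LABELS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]

def tape_for(day: date) -> str:
    """Return the label of the tape that belongs to the backup run on `day`."""
    return TAPE_LABELS[day.weekday()]

def safe_to_overwrite(loaded_label: str, day: date) -> bool:
    """Allow writing only if the loaded tape is the one assigned to `day`.

    This guards against overwriting last night's tape with tonight's backup.
    """
    return loaded_label == tape_for(day)
```

A backup script could call `safe_to_overwrite` with the label read from the loaded tape and refuse to start if it returns `False`, so that an interrupted run can never damage the previous night's copy.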
Figure 18-1. An incremental backup
18.1.4 Guarding Against Media Failure
You can use two distinct sets of backup tapes to create a tandem backup. With this backup strategy, you create two complete backups (call them A and B) on successive backup occasions. Then, when you perform your first incremental backup, the "A incremental," you back up all of the files that were created or modified after the last A backup, even if they are on the B backup. The second time you perform an incremental backup, the "B incremental," you write out all of the files that were created or modified since the last B backup (even if they are on the A incremental backup). This system protects you against media failure because every file is backed up in two locations. It does, however, double the amount of time that you will spend performing backups.
18.1.4.1 Replace tapes as needed
Tapes are physical media, and each time you run them through your tape drive they degrade somewhat. Based on your experience with your tape drive and media, you should set a lifetime for each tape. Some vendors establish limits for their tapes (for example, 3 years or 2,000 cycles), but others do not. Be certain to see what the vendor recommends, and don't push that limit. The few pennies you may save by using a tape beyond its useful range will not offset the cost of a major loss.
18.1.4.2 Keep your tape drives clean
If you make your backups to tape, follow the preventative maintenance schedule of your tape drive vendor, and use an appropriate cleaning cartridge or other process as recommended. Being unable to read a tape because a drive is dirty is inconvenient; discovering that the data you've written to tape is corrupt and no one can read it is a disaster.
18.1.4.3 Verify the backup
On a regular basis you should attempt to restore a few files chosen at random from your backups to make sure that your equipment and software are functioning properly.
Not only will this reveal if the backups are comprehensive, but the exercise of doing the restoration may also provide some insight.
Stories abound about computer centers that have lost disk drives and gone to their backup tapes, only to find them all unreadable. This scenario can occur as a result of bad tapes, improper backup procedures, faulty software, operator error (see the sidebar), or other problems.
At least once a year, you should attempt to restore your entire system completely from backups to ensure that your entire backup system is working properly. Starting with a different, unconfigured computer, see if you can restore all of your tapes and get the new computer operational. Sometimes you will discover that some critical file is missing from your backup tapes. These practice trials are the best times to discover a problem and fix it.
Backup nightmares abound. One of this book's reviewers told us about a large Chicago law firm that never bothered to verify backups. They had to wait until their hard drive crashed to learn that their tape drive's stepper motor had stopped stepping and was writing the entire backup to a single track, with later data overwriting earlier data in the same backup.
We have also heard many stories about how the tape drive used to make the backup tapes had a speed or alignment problem. Such a problem results in the tapes being readable by the drive that made them, but unreadable by every other tape drive in the world! Be sure that you try loading your tapes, CD-ROMs, and disks on other drives when you check them.
18.1.5 How Long Should You Keep a Backup?
It may take a week or a month to realize that a file has been deleted. Therefore, you should keep some backup tapes for a week, some for a month, and some for several months. Many organizations make yearly or quarterly backups that they archive indefinitely. After all, tape is cheap.
Some organizations decide to keep their yearly or biannual backups "forever"—it's a small investment in the event that they should ever be needed again.
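A tiered retention rule of the kind described above might be expressed as follows. This is a minimal sketch, not a recommended policy: the specific thresholds (7, 31, and 180 days), the choice of Sunday and first-of-month backups as the longer-lived copies, and the function name `keep_backup` are all assumptions made for illustration.

```python
from datetime import date

def keep_backup(made: date, today: date) -> bool:
    """Decide whether a backup made on `made` should still be kept on `today`.

    Assumed tiers (illustrative only):
      - every backup is kept for a week
      - Sunday backups are kept for about a month
      - first-of-month backups are kept for about six months
      - first-of-quarter backups are archived indefinitely
    """
    age = (today - made).days
    if age <= 7:
        return True
    if made.weekday() == 6 and age <= 31:  # weekday() == 6 is Sunday
        return True
    if made.day == 1 and age <= 180:
        return True
    if made.day == 1 and made.month in (1, 4, 7, 10):
        return True
    return False
```

Tapes for which `keep_backup` returns `False` can go back into the rotation. The tiers should be adjusted to match how long it usually takes your users to notice that a file is missing.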
You may wish to keep on your system an index or listing of the names of the files on your backup tapes. This way, if you ever need to restore a file, you can find the right tape to use by scanning the index, rather than by reading every single tape. Having a printed copy of these indexes is also a good idea, especially if you keep the online index on a system that may need to be restored!
18.1.6 Security for Backups
Backups pose a double problem for computer security. On the one hand, your backup tape is your safety net; ideally, it should be kept far away from your computer system so that a local disaster cannot ruin both. On the other hand, the backup contains a complete copy of every file on your system, so the backup itself must be carefully protected.
18.1.6.1 Physical security for backups
If you use tape drives to make backups, be sure to take the tape out of the drive. One company in San Francisco that made backups every day never bothered removing the cartridge tape from their drive. When their computer was stolen over a long weekend by professional thieves who went through a false ceiling in their office, they lost everything. "The lesson is that the removable storage media is much safer when you remove it from the drive," said an employee after the incident.
If possible, avoid storing your backup tapes in the same room as your computer system. Any disaster that might damage or destroy your computers is likely to damage or destroy anything in the immediate vicinity of those computers as well.
You may wish to consider investing in a fireproof safe to protect your backup tapes. However, the safe should be placed off site, rather than right next to your computer system. While fireproof safes do protect against fire and theft, they don't protect your data against explosion, many kinds of water damage, and building collapse.
Be certain that any safe you use for storing backups is actually designed for storing computer media. One of the fireproof lockboxes from the neighborhood discount store might not be magnetically safe for your tapes. It might be heat-resistant enough for storing paper, but not for storing magnetic tape, which cannot withstand the same high temperatures. Also, some of the generic fire-resistant boxes for paper are designed with a liquid in the walls that evaporates or foams when exposed to heat to help protect paper inside. Unfortunately, these chemicals can damage the plastic in magnetic tape or CD-ROMs.
18.1.6.2 Write-protect your backups
After you have removed a backup tape from a drive, do yourself a favor and flip the write-protect switch. A write-protected tape cannot be accidentally erased.
If you are using the tape for incremental backups, you can flip the write-protect switch when you remove the tape, and then flip it again when you reinsert the tape later. If you forget to unprotect the tape, your software will probably give you an error and let you try again. On the other hand, having the tape write-protected will save your data if you accidentally put the wrong tape in the tape drive, or run a program on the wrong tape.
18.1.6.3 Data security for backups
File protections and passwords protect the information stored on your computer's hard disk, but anybody who has your backup tapes can restore your files (and read the information contained in them) on another computer. For this reason, keep your backup tapes under lock and key.
In the early 1990s an employee at a computer magazine pocketed a 4 mm cartridge backup tape that was on the system manager's desk. When the employee got the tape home, he discovered that it contained hundreds of megabytes of personal files, articles in progress, customer and advertising lists, contracts, and detailed business plans for a new venture that the magazine's parent company was planning.
The tape also included tens of thousands of dollars worth of computer application programs, many of which were branded with the magazine's name and license numbers. Quite a find for an insider who was setting up a competing publication! When you transfer your backup tapes from your computer to the backup location, protect the tapes at least as well as you normally protect the computers themselves. Letting a messenger carry the tapes from building to building may not be appropriate if the material on the tapes is sensitive. Getting information from a tape by bribing an underpaid courier, posing as the package's intended recipient, or even knocking him unconscious and stealing it, is usually easier and cheaper than breaching a firewall, cracking some passwords, and avoiding detection online. The use of encryption can dramatically improve security for backup tapes. Years ago encryption was done in hardware using special tape drives. Today, backup encryption is largely done with software, which is usually as secure and offers more flexible key management. Unfortunately, this flexibility can cause problems if it is not managed properly. If you do choose to encrypt your backup tapes, be sure that the decryption key is known by more than one person, or escrow the key with a third party. After all, the backups are worthless if the only person with the key forgets it, becomes incapacitated, or quits and refuses to divulge the information. Here are some recommendations for storing a backup tape's encryption key:
18.1.7 Legal Issues
Finally, some firms should be careful about backing up too much information or holding it for too long. Recently, backup tapes have become targets in lawsuits and criminal investigations. Backup tapes can be obtained by subpoena in criminal investigations or during discovery in lawsuits.
For this reason, many organizations have adopted "data retention" or "data destruction" policies. These policies typically mandate that all files pertaining to a matter be destroyed a certain time after the matter is closed or the transaction is settled. Frequently, data retention policies are influenced by government regulations. For example, the federal government might mandate that a particular firm retain its records for three years to assist in assuring the firm's compliance with a particular regulation. The firm might then implement a retention policy mirroring this regulatory requirement, and further require that all records (including backup tapes) be destroyed after three years and one day.
Many firms (and universities) decide to set limits on data retention of user files to reduce the overhead in doing searches. A typical tactic in civil suits is to seek discovery of all versions of all files that might contain a certain set of keywords, or that were likely to be touched by certain people. The time and effort required to comply with such "fishing expeditions" can be quite extensive, and often is not reimbursed. If the copies don't exist, then there is no need to do the search!
However, bear in mind that destruction of information covered under applicable law or destruction of data after receipt of a valid court order is illegal and may result in both fines and jail time. Keep the images of Oliver North and Enron in mind, and remember that wholesale destruction of records is not always appropriate, even if the records are past their prime.
To assist in implementing retention policies, you may wish to segregate potentially sensitive data so that it is stored on separate backup tapes. For example, you can store applications on one tape, pending cases on another tape, and library files and archives on a third. In this manner, you can comply with policies and regulations for your datafiles, while keeping other backups according to schedules that are dictated by other motivations.
Back up your data, but back up with caution and a plan.
18.1.8 Deciding Upon a Backup Strategy
The key to deciding upon a good strategy for backups is to understand the importance and time-sensitivity of your data. As a start, we suggest that the answers to the following questions will help you plan your backups:
In the following sections, we outline some typical backup strategies for several different situations.
18.1.9 Individual Workstation
Many users do not back up their workstations on a regular basis: they think that backing up their data is too much effort. Unfortunately, they don't consider the effort required to retype everything that they've ever done to recover their records.
Here is a simple backup strategy for users with PCs or standalone workstations.
18.1.9.1 Backup plan
This strategy never uses incremental backups; instead, complete backups of a particular set of files are always created. Such project-related backups tend to be incredibly comforting and occasionally valuable. (We found this to be the case in preparation of the third edition of this book: one of us accidentally overwrote the changes another had made, and the backups saved many days of effort!)
18.1.9.2 Retention schedule
18.1.10 Small Network of Workstations and a Server
Most small groups rely on a single server with up to a few dozen workstations. In our example, the organization has a single server with several disks, 15 workstations, and a DAT tape backup drive. The organization doesn't have much money to spend on system administration, so it sets up a system for backing up the most important files over the network to a specially designed server.
18.1.10.1 Backup plan
The daily and hourly backups are done automatically via scripts run by the cron daemon. All monthly and weekly backups are done with shell scripts that are run manually. The scripts both perform the backup and then verify that the data on the tape can be read back, but the backups do not verify that the data on the tape is the same as that on the disk. (No easy verification method exists for the standard dump/restore programs on many Unix systems, although Linux's restore -C can compare data on tape to data on disk.) Automated systems should be inspected on a routine basis to make sure they are still working as planned. You may have the script notify you when completed, sending a list of any errors to a human (in addition to logging them in a file).
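The missing step, checking that the backup matches the disk, can be illustrated with a short Python sketch. It uses a tar archive as a stand-in for a dump tape, which is a substitution made for this example and not the method the scripts above use; the function name `verify_archive` and its message strings are likewise hypothetical.

```python
import hashlib
import tarfile
from pathlib import Path

def _sha256(stream) -> str:
    """Hash a binary stream in 1 MB chunks so large files are not loaded whole."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(1 << 20), b""):
        digest.update(chunk)
    return digest.hexdigest()

def verify_archive(archive: str, root: str) -> list[str]:
    """Compare each regular file in `archive` with its copy under `root`.

    Returns a list of human-readable problems; an empty list means that
    every archived file was readable and matched the file on disk.
    """
    problems = []
    with tarfile.open(archive) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and devices
            on_disk = Path(root) / member.name
            if not on_disk.is_file():
                problems.append(f"missing on disk: {member.name}")
                continue
            with tar.extractfile(member) as archived, on_disk.open("rb") as live:
                if _sha256(archived) != _sha256(live):
                    problems.append(f"differs from disk: {member.name}")
    return problems
```

A script run from cron could mail the returned list to an operator whenever it is not empty. Note that a reported difference does not always mean a bad backup: a file that was legitimately changed between the backup and the check will also be listed.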
18.1.10.2 Retention schedule
18.1.11 Large Service-Based Network with Small Budget
Most large decentralized organizations, such as universities, operate networks with thousands of users and a high degree of autonomy between system operators. The primary goal of the backup system of these organizations is to minimize downtime in the event of hardware failure or network attack; if possible, the system can also restore user files deleted or damaged by accident.
18.1.11.1 Backup plan
Every night, the backup staging area is synchronized with the contents of the partitions on its matching primary server using the rsync[6] program. The following morning, the entire disk is copied to a high-speed tape drive.
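The nightly synchronization could be driven by a small Python wrapper like the one below. It is a sketch under assumptions: the host and path names in the usage note are hypothetical, and the text does not say which rsync options the organization uses. The `-a` (archive) and `--delete` options shown here are standard rsync options chosen for illustration.

```python
import subprocess

def rsync_command(source: str, staging: str) -> list[str]:
    """Build the rsync command that mirrors `source` into `staging`.

    -a        preserve permissions, ownership, timestamps, and symlinks
    --delete  remove files from the staging area that no longer exist on
              the source (an assumed choice; see the note after the code)
    The trailing slash on the source makes rsync copy the directory's
    contents rather than the directory itself.
    """
    return ["rsync", "-a", "--delete", source.rstrip("/") + "/", staging]

def sync_partition(source: str, staging: str) -> None:
    """Run the synchronization and raise CalledProcessError if rsync fails."""
    subprocess.run(rsync_command(source, staging), check=True)
```

For example, `sync_partition("primary:/export/home", "/staging/home")` would mirror one partition. With `--delete`, a file removed on the primary disappears from the staging area at the next nightly run, so after that point it can be recovered only from tape; omitting the option keeps deleted files on the staging disk at the cost of extra space.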
Using special secondary servers dramatically eases the load of writing backup tapes. This strategy also provides a hot replacement system should the primary server fail. Furthermore, the backup system provides a "safety net" for users who accidentally delete their files; these files can instantly be recovered from the backup system, often without the involvement of the system management.
18.1.11.2 Retention schedule
Backups are retained for two weeks. During that time, users can have their files restored to a special "restoration" area, perhaps for a small fee. Users who want archival backups for longer than two weeks must arrange backups of their own. One of the reasons for this decision is privacy: users should have a reasonable expectation that if they delete their files, the backups will be erased at some point!
18.1.12 Large Service-Based Networks with Large Budget
Many banks and other large firms have requirements for minimum downtime in the event of a failure. Thus, current and complete backups that are ready to go at a moment's notice are vital. In this scheme we use redundant servers, clustered database systems, and elaborate tape farms to provide for adequate backup.
The organization sets up two duplicate servers: one in New York City, the other at a facility in upstate Pennsylvania where real estate is cheap (and it is only a 2-hour drive from New York). Each server is configured with a RAID device for its local disk. RAID can be configured for RAID level 1 (disk mirroring) or RAID level 5 (redundancy provided through the use of parity and error-correcting codes).
Both the primary site in New York and the secondary site in Pennsylvania run identical software installations. The database servers are configured in tandem so that all transactions sent to the primary machine are simultaneously sent to the secondary machine. Software developed and maintained by the database vendor assures that the two systems are kept in sync, and updates them as necessary.
Instead of having software patches, updates, and new systems automatically mirrored from the primary to the secondary, all of these software modifications are carefully planned out, then applied to a test system. After thorough testing with static copies of data, the software is then installed on the secondary machine for testing with near-live data. That installation is then tested. If no adverse impacts are found, the software update is then applied to the primary machine. Development is done on a separate development system. After thorough testing and review, it is deployed in the same manner as with system patches, described above. If a failure of the main system occurs, the remote system is activated. Any pending transactions are replayed on the database, and it then becomes the primary site. The primary site can be brought back online during scheduled downtime. Meanwhile, a disaster recovery plan is initiated whereby the development system (at yet another location) is brought up to mirror the now primary system until the original primary system is brought back on line.
18.1.12.1 Backup plan
Backups are done from the secondary machine, which presumably has a lower load because it is not serving queries, running only test scripts and receiving database updates. If the backup system is a managed storage solution, such as an EMC Symmetrix, the system takes a snapshot of each disk partition, and it is these snapshots that are backed up.
Every morning, encrypted DVD-ROMs are made of the contents of the backup system. The DVDs are then copied, and the copies sent by bonded courier to different branch offices around the country.
18.1.12.2 Retention schedule
The daily DVDs are saved at the branch offices for seven years under lock and key. This is a total of more than 2,500 DVDs archived at each branch office. At the primary and secondary sites, the DVDs from the end of each month are archived forever.
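The disc counts in this retention schedule follow from simple arithmetic, shown here as a short Python sketch. It assumes one disc per daily backup and ignores leap days; the function names are hypothetical.

```python
DAYS_PER_YEAR = 365  # leap days ignored, so the true count is a day or two higher
RETENTION_YEARS = 7  # retention period for the daily discs, from the schedule above

def daily_discs_on_hand(years: int = RETENTION_YEARS) -> int:
    """Number of daily discs a branch office holds once the schedule is full."""
    return DAYS_PER_YEAR * years

def month_end_discs(years_in_operation: int) -> int:
    """Month-end discs kept forever at the primary and secondary sites."""
    return 12 * years_in_operation
```

Under these assumptions `daily_discs_on_hand()` returns 2555, consistent with the "more than 2,500" figure, while the permanent month-end archive grows by 12 discs per year. If a daily backup spans several discs, both counts scale accordingly.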