How To Back Up Your Personal Computer
January 30, 2008
NOTE: This article was the basis for the article "Backup for Photographers" that appeared in the Dec-Nov 2009 issue of Photo Technique, in collaboration with Uwe & Bettina Steinmueller. However, I always liked the original article better.
References to specific hardware, software, and services are way out of date, but the overall advice is still good. For an update, see this blog article.
(Updated 31-Jan-2008, 7-Feb-2008, and 8-Feb-2008; updates in red)
I've read countless articles about backup over the years, but not one of them approached the problem properly. They all focus too much on hardware and software, and not enough on what I call independence—getting the copies and the originals separated so they're not subjected to the same threats. And, too often they deal with just one or two threats, most commonly a disk crash, and ignore the others. In fact, not one article even bothered to enumerate the threats to help check that the backup plan deals with each one of them.
I've been thinking for nearly a year about writing my own article that would set matters straight, and now it's time. I blogged a bit about backup last August and promised more; what you'll find here is much more.
I'll cover the principles of backup, list the threats to your data (there are six of them), describe the specific hardware, software, and online facilities available for Windows and OS X, and then outline some specific backup plans. My focus is on personal-computer backup rather than servers or mainframes, although the principles apply universally. Also, I only briefly touch on data privacy (preventing unauthorized use of data); unfortunately, backup usually reduces data privacy, since the more copies there are the more opportunity for someone to gain access to one.
Here's a detailed outline:
What is Backup?
There are some links below to Amazon. If you click on one to buy anything, it won't cost you extra, but I get a fee from Amazon.
Disclosure: I have done some consulting work for Microsoft, and some of that work had to do with storage systems, although not with backup directly. As you will see as you read below, I don't think that association biased my conclusions.
What is Backup?
A backup is a copy of data that is sufficiently independent of the original so that destructive events can't affect both at the same time. Backup doesn't prevent destruction of data; it only allows you to recover the data once the destruction has occurred.
A simple example of backup is copying files from a laptop to a CD (the act of copying), putting the CD in your desk at home (independence of location), and having the laptop stolen (the destructive event). Since the CD wasn't stolen, you still have the data that was copied to it. Sure, your laptop is gone and maybe sensitive information has fallen into the wrong hands, but at least you still have the data you saved to the CD. (You cover loss of the laptop with insurance, and you protect sensitive data with encryption, but those subjects are outside the scope of this article.)
To design and implement a backup plan one has to consider the possible threats to data (e.g., theft, electrical surge, fire), the various ways to copy data (e.g., using the Mac Finder or Windows Explorer, using a backup utility, using CD/DVD-burning software), and ways of achieving independence (e.g., online storage, placing media in a safe-deposit box).
Unfortunately, nearly all articles about backup focus on the copying, and ignore the other two (threats and independence). But without evaluating all the threats, there's no way to be sure that the backup will allow you to recover from them, and insufficient independence means that both the data and the backup can be destroyed by the same event. An obvious example is a fire that destroys everything in an office, including any backups kept there.
Additionally, if backup isn't convenient—automatic, ideally—it won't get done often enough to be effective. (It's common after a data loss for someone to regret that their most recent backup is months old.) Restore has to be convenient, too, or else the damage will be compounded. An electrical surge is bad enough, but if your data is unavailable for a week while you restore it from an online service, you still lose a week of productivity.
Since backup is potentially expensive and time-consuming, you also have to consider the importance of your data. Backing up irreplaceable photographs is more important than backing up application preferences, and backing up browser caches or temporary files isn't important at all.
Threats to Your Data
There are only six types of threats that can destroy data: User Error, Computer Failure, Disappearance, Surge, Office Destruction, and Regional Disaster.
The Perfect Backup Solution
The almost-perfect solution is to back up your entire computer every hour to an ultra-reliable, redundant, online storage service such as Amazon's S3. I say "almost" because the backup software you use might have defects. To make it perfect, you need two or more completely independent copying utilities and services.
Unfortunately, there's too much data. For example, if a photographer comes back from a shoot with 20GB of photos (not unusual) and has a T1 line (1.544 megabits per second) operating at 100% efficiency (extremely unusual), it would take 29 hours to copy the photos to an online service. Every 1GB of image data modified (50 photos at 20MB each) would take an additional 1.5 hours or so to upload. That's assuming that there is a T1 line, that it operates at 100% efficiency, that the line isn't being used for anything else, and that the online services can receive and store the data that fast. At a more realistic upload speed, say 500 kilobits per second, it would take more than 3 days to upload the 20GB, by which time the photographer might have shot another 60GB. The backup would never finish.
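The arithmetic above is easy to check with a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes decimal units (1GB = 10^9 bytes), a line running at its full rated speed, and no protocol overhead.

```python
def upload_hours(gigabytes, bits_per_second):
    """Hours needed to upload `gigabytes` at a sustained line rate.

    Assumes decimal gigabytes (10**9 bytes) and no protocol overhead.
    """
    bits = gigabytes * 10**9 * 8
    return bits / bits_per_second / 3600


T1 = 1.544e6    # T1 line, in bits per second
SLOW = 500e3    # a more realistic upload speed, in bits per second

print(f"20GB over a T1:      {upload_hours(20, T1):.1f} hours")
print(f"1GB over a T1:       {upload_hours(1, T1):.1f} hours")
print(f"20GB at 500 kbit/s:  {upload_hours(20, SLOW) / 24:.1f} days")
```

Under those assumptions the results come out to about 28.8 hours, 1.4 hours, and 3.7 days, which is where the rounded figures in the paragraph above come from. Any real line will be slower.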
Oh, I forgot... the photographer needs 2 T1 lines, because we were going to use two independent services.
So, the problem with the perfect backup scheme is that it won't work. We need to back up to hard drives and/or optical disks, and that gets very complicated, as you already know.
Protecting Against Threats
The purpose of backup is to recover from damage, not prevent it. Still, it makes sense to reduce the likelihood of damage occurring; restoring is always a chore, and you still have to replace the damaged or stolen equipment. Think of protection as reducing the probability of a restore, rather than reducing the need for backup.
Here are some things you can do:
As I said, even if you do any of the above, it doesn't allow you to get away without backups. It just lessens the chance that you'll need them.
Planning for Backup and Restoring
Convenience vs. Independence
Generally, the more convenient a backup method is, the less independence you get. So, you'll need a combination of methods: One or two that are convenient but provide just enough independence to protect against the most common threats, and one or two that are inconvenient but which provide complete independence.
For example, using a background backup utility like Apple's new Time Machine, available with OS X 10.5 (Leopard), is convenient, but since the backup drive has to be within WiFi range and plugged in to power, it doesn't provide enough independence to protect against Surge, Office Destruction, or Regional Disaster. But, since those attacks are so much less common than User Error, Computer Failure, and Disappearance, running Time Machine is a great idea. It's just not the only idea.
For protection against surge, all you need to do is back up to an external disk (probably not with Time Machine) that you can unplug and, for good measure, put into a fireproof media safe. Store that drive in a neighbor's house and you'll protect against Office Destruction as well. Take it to your mother's house 25 miles away and you're protected against most Regional Disasters. Copy your irreplaceable files, such as photos, to Amazon's S3 and you're even more completely protected.
You Need a Backup Plan
The way we arrived at the combination of methods in the previous section was to list the six types of threats, list the available backup methods, and then pair them up to ensure that we were covered. The more backup methods available and the more you know about them, the more effectively you can come up with something you can live with. If your plan is too inconvenient, you'll find you're not using it, and then you won't be protected.
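The pairing-up exercise can be written down as a small table and checked mechanically. The sketch below is illustrative only: the method names and the threats assigned to each are my reading of the example in the previous section, not a definitive classification, and the function name is made up.

```python
THREATS = [
    "User Error", "Computer Failure", "Disappearance",
    "Surge", "Office Destruction", "Regional Disaster",
]

# Which threats each method is assumed to protect against (illustrative).
PLAN = {
    "Time Machine to an always-on drive": {
        "User Error", "Computer Failure", "Disappearance"},
    "Unplugged external drive in a media safe": {"Surge"},
    "Drive stored at a neighbor's house": {"Office Destruction"},
    "Drive stored 25 miles away": {"Regional Disaster"},
}


def uncovered(plan, threats=THREATS):
    """Return the threats that no method in the plan protects against."""
    covered = set().union(*plan.values())
    return [t for t in threats if t not in covered]


print(uncovered(PLAN))  # an empty list means every threat is paired up
```

Dropping a method from the table shows immediately which threats are left exposed, which is the point of making the list explicit.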
As I'll explain below, neither Windows (even Vista) nor OS X comes with sufficient backup software, so you'll need to buy a third-party utility.
You'll have to spend some money, mostly for software and external drives. The software will cost less than $100, a couple of 120GB drives for your most important data will cost about $100 each, and each 500GB drive will cost less than $200. Amazon's S3 service is really cheap, only a few dollars a month. So, for about $500, a little planning work on your part, and a slight change to your work habits you can get almost 100% protection from the six types of threats.
You Need a Restore Plan
It's a safe bet that very few people who do backup have ever tried a restore to see if it works. And, it's not hard to see why: Restoring a complete system is pretty disruptive, and if it doesn't work you will have just wiped out a perfectly good system. You have to put another hard drive in the system so you can safely overwrite it, or wait until you have a new computer. (I have a Windows desktop with six drive bays with handy slide-out trays, so it's very easy to pop in a new drive to test a restore while the primary drive is safely out of the computer. But my desktop is very unusual, and it's more common today to see computers getting smaller and even more closed up.)
Even without actually doing a complete restore, you should spot-check the backup to ensure that your files are really there. Backup software that won't let you do this, such as Vista's Complete PC Backup, should be avoided.
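If the backup is a plain copy of your folders (not a proprietary archive format), a spot-check can be scripted. The following is a minimal sketch under that assumption; the function names and the example paths are hypothetical, and it only compares a random sample of files by checksum.

```python
import hashlib
import random
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large photos don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def spot_check(source_dir, backup_dir, sample_size=20, seed=None):
    """Compare a random sample of source files with their backup copies.

    Returns a list of (relative_path, problem) tuples, where problem is
    "missing" or "differs". An empty list means the sample checked out.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    sample = random.Random(seed).sample(files, min(sample_size, len(files)))
    problems = []
    for src in sample:
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        if not copy.is_file():
            problems.append((str(rel), "missing"))
        elif file_digest(src) != file_digest(copy):
            problems.append((str(rel), "differs"))
    return problems


# Example call (hypothetical paths):
# print(spot_check("/Users/me/Photos", "/Volumes/Backup/Photos"))
```

A file reported as "differs" isn't necessarily a bad backup; it may simply have been edited since the last backup ran. A file reported as "missing" deserves a closer look.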
Your restore plan also has to include a way to replace damaged hardware. If you live near computer stores and they're open when you need them, you might be able to just go out and buy what you need when you need it. But if not, and time-to-restore is important, you'll have to have replacement equipment on hand. The equipment really has to be available; if the replacement hard drive has data on it, you won't be able to use it without destroying that data.
After a restore, make sure you don't start running without a backup. For example, suppose you keep a complete, bootable copy of your primary drive on an external drive. If the primary drive fails, you can boot from the external drive, which gets you up and running immediately, losing only a few hours of work. But, if you run that way, you no longer have your backup, since the backup drive has become the primary and the old primary is dead. Instead, you should immediately clone the backup to a replacement primary drive or, if that's not feasible, clone the backup to a second external drive.
Backup Hardware
While you'll sometimes back up to CDs/DVDs, flash drives, or to an online service, because of the amount of data to be backed up you'll more often use a hard drive. The main choices are a single external drive, a network-attached drive, or a RAID drive (which could also be network-attached). Internal drives aren't good choices because they're not sufficiently independent of the computer being backed up. They're gone if the computer is stolen, and they share the same drive controller, so a controller failure could destroy the data on all the internal drives.
FireWire and USB External Drives
These drives are great choices for backup. They're available in sizes from about 100GB to 2TB, and some of them are even entirely powered from the USB cable, which makes connecting them and transporting them especially convenient.
If the drive is going to be running all the time, put it out of sight, such as on a shelf under your desk, or on a bookshelf with some books or a family photo in front of it. Figure that a thief won't know your drive is even there and, even if he or she does, nobody wants to heist a $200 drive when there are computers, CDs, and jewelry to take instead, all much easier to fence.
Network Attached Storage (NAS)
This is a fancy name for a drive connected to your local network instead of directly to a computer. NAS has been around for years, but only recently have devices been created for home and small-office users that are cheap and easy to set up.
The best-known new NAS device is Apple's new Time Capsule, which is a 500GB or 1TB drive combined with a wireless base station. It allows you to back up, with or without Time Machine (OS X Leopard's built-in backup software), to the drive wirelessly. This means it can be farther from your computers than a directly-connected drive, such as in a closet, on a high bookshelf, or in another room. (Friendly neighbors might even be able to put their Time Capsules in each other's houses, although I haven't tested that arrangement.)
One popular form of NAS is another computer on the network, connected with WiFi (like Time Capsule) or wired. But another computer is much more expensive than just an attached drive, requires much more power and space, and introduces another machine to be set up, booted, maintained, and possibly even backed up. A little box you can just attach and forget about is what's really needed.
To find NAS boxes, go to Amazon and search for "nas drive", "nas disk", "ethernet drive", or "ethernet disk". You'll find drives with network attachments, drive cases with network attachments (you have to add the drive), cases with WiFi and with and without drives, and even cheap ($85) devices to which you can attach a USB drive. Here are some examples, none of which I'm recommending, as I haven't tried them:
Some things to watch out for when you buy an NAS device:
RAID
RAID stands for Redundant Array of Inexpensive Disks. The problems with RAID are that it costs extra for the same amount of storage (at least one extra disk and some fancy electronics) and that the disks aren't nearly independent enough.
There are various RAID arrangements, the two most popular of which are RAID-1 (mirroring) and RAID-5 (striping with parity). The idea is that, if a disk fails, it can be removed from the running system and replaced without the system going down or any data being lost. When you replace the defective disk, the RAID system automatically restores, over a period of hours, the data that was on it. (So-called RAID-0, also called striping, isn't really RAID at all because there's no redundancy. The loss of either disk destroys all the data on both disks. RAID-0 is for performance, not for reliability, which is actually reduced.)
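To see how parity lets a RAID-5 array rebuild a failed disk, here is a toy illustration in Python. It is a simplified sketch, not how any particular RAID controller is implemented: real RAID-5 rotates the parity blocks across all the disks, while this example uses one stripe with the parity kept separately. The block contents and the function name are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of several equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: a block on each of three data disks, plus its parity block.
data = [b"PHOTO-01", b"PHOTO-02", b"PHOTO-03"]
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the surviving blocks with the parity
# block to recover what was on the failed disk.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

The same XOR that produces the parity also recovers any single missing block, which is why one disk can fail without data loss. It does nothing for a deleted file, a surge, or a fire, since parity faithfully mirrors whatever happens to the data.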
RAID is OK if you're using it to replace what otherwise would be a single drive (primary or backup), but it doesn't substitute for backup. Moreover, if your budget is limited, the extra money you spend on RAID might mean that you have less to spend on backup.
To see why RAID doesn't diminish the need for backup, we can do a quick threat analysis, as we do for all other backup methods:
Anyone who thinks RAID provides sufficient protection is focused too narrowly on a single kind of failure, the disk itself, and is ignoring the other threats.
Backup Software
OS X and Windows come with backup software, but it's insufficient because either it's inconvenient and error prone (on OS X, without third-party software, the only way to copy files to a removable drive is with the Finder) or because it doesn't meet what I consider the essential requirements for backup software. I'll review those requirements and then briefly review both the backup software that comes with OS X and Windows ("in-the-box") and what you can buy from a third party.
Minimum Requirements for Backup Software
The following requirements must be met by any backup system:
Encryption
As I said at the start, I'm not going to say much about data privacy (e.g., preventing identity theft), except to note that backup makes the problem worse.
However, many backup programs provide a way to encrypt the backup data, and that's what you should do. If your backup utility doesn't provide encryption, you may be able to encrypt the data anyway by encrypting the volume the data is written to. On a Mac, you can use Disk Utility to create an encrypted disk image, as I explain here. (8-Feb-2008 update) I'm not sure the method in that article is practical. I've updated the article to explain why.
OS X: In-the-Box Backup Software
OS X 10.5 (Leopard) comes with only one backup program, Time Machine, and it's a great one that everyone should use. The previous OSes didn't come with anything, although there was always a way to burn CDs and DVDs and to copy files with the Finder.
OS X: Add-On Backup Software
If you subscribe to Apple's .Mac online service you can download a backup utility called Backup. Its primary purpose is to back up online, but you're supposed to be able to use it to back up to external drives and CDs/DVDs. The one time recently when I used Backup to back up to DVDs, it failed.
A much better choice is Super Duper.
I've used Super Duper to make both bootable disk images and partial backups for carrying offsite, but haven't been able to for a few months because
I also used iBackup before I got Super Duper, and it seemed to work fine, although it won't back up a complete system like Super Duper will.
I've tried Retrospect for the Mac, which interested me because it's one of the few backup programs that can write CDs/DVDs, but it repeatedly hung trying to write a DVD.
I've just written my own utility to copy folders and files to multiple CDs and DVDs, SpanBurner ($10).
Windows Vista: In-the-Box Backup Software
I'm not going to discuss the backup software that comes with XP because (1) with one exception, noted below, Vista's software is much better, and even Vista's is unacceptable, and (2) Vista has been Microsoft's current OS for over a year.
Vista comes with four backup systems:
So, while Vista has four backup systems, even all of them used together don't provide protection from the six threats. Each of them has serious flaws:
All of the Vista backup systems use the Volume Shadow Copy Service (VSS) to ensure that only internally-consistent files are backed up, even if an application has them open. OS X doesn't have anything like that; their thinking is that Time Machine will get the file next time. None of my critical remarks are aimed at VSS, which is an excellent piece of work.
I have heard that XP lacks most of the four backup systems in Vista, but that there is a backup program, not installed by default, that works better than Back Up Files, in that you can at least tell it what you want backed up.
Vista and XP: Add-On Backup Software
This field is so vast and rich that I wouldn't know where to start, so I'll just make two general comments:
For years I used Retrospect on Windows, and it ran reliably. I can't say for sure that it would have restored my entire system, because I never tried it. When I spot-checked for files on the backup, they were always there.
Online Backup Services
I dismissed online backup earlier because it wasn't perfect, but that doesn't mean it's not a good idea, especially for your most important data, such as irreplaceable digital images.
A widely-used Apple online backup service comes with .Mac. You use the Backup utility, which I mentioned above.
Unfortunately, in two years of trying I have never gotten Backup to work reliably with .Mac, so I don't recommend it. The problem is that when I set it up it runs OK for a few days, then quits with absolutely no notice whatsoever (other than an obscure log entry in a place I never look), and I have no idea how to get it started again. Most recently, with over 20GB free on my .Mac account, it quit claiming it was out of space. As I mentioned above, I haven't gotten it to back up to DVDs, either.
.Mac also comes with a virtual online disk, called iDisk, which you can read and write like an ordinary disk, only much slower. I have had some success dragging things to iDisk with the Finder. I've also used Transmit to create a droplet that sits on my dock. Any files or folders I drag from the Finder to the droplet go off to iDisk, which is something I do every hour or so when I'm working on an important document.
The online backup service I like the best is Amazon S3. Amazon has designed it to be extremely reliable and available, it's widely supported, and it's very cheap. For example, .Mac costs about $6 per GB per month, but S3 is only 15 cents per GB per month, or 13 times cheaper. S3 also charges a transfer fee of 10 cents per GB in, and a bit more coming out; transfer in is what you'd use for backup. If you transfer 50 GB a month, which is a huge amount to transfer online, that's still only $60 per year. (At DSL speeds it would take about a week to transfer 50GB, assuming the line is doing nothing else and your computer stays up for a week.)
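The S3 cost estimate is simple enough to sketch in Python. The prices are the 2008 figures quoted above and will certainly have changed; the function names are made up, and the 100GB storage figure is just an example of mine. Amounts are kept in cents to avoid floating-point rounding surprises.

```python
# 2008 prices quoted in the article, in cents.
STORAGE_CENTS_PER_GB_MONTH = 15
TRANSFER_IN_CENTS_PER_GB = 10


def yearly_transfer_cost(gb_uploaded_per_month):
    """Dollars per year spent on uploading (transfer in) to S3."""
    return gb_uploaded_per_month * TRANSFER_IN_CENTS_PER_GB * 12 / 100


def monthly_storage_cost(gb_stored):
    """Dollars per month to keep `gb_stored` gigabytes on S3."""
    return gb_stored * STORAGE_CENTS_PER_GB_MONTH / 100


print(f"Uploading 50GB a month: ${yearly_transfer_cost(50):.2f} per year")
print(f"Storing 100GB:          ${monthly_storage_cost(100):.2f} per month")
```

Note that the two charges are separate: the transfer fee is paid once when data goes up, while the storage fee recurs every month for as long as the data stays there.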
Also, as far as I know, .Mac works only with OS X, whereas S3 works with any system, so you can use it to share files across platforms.
I use S3 in three ways:
There are numerous online backup systems for Windows, but probably none are as cheap as S3.
As I've said, S3, or any online storage, is really convenient for small files. For large amounts of data, it's still a good long-term solution, but it takes a lot of time to upload the data. It's worth it, however, for important data that doesn't change. I have thousands of photos on S3 that are now backed up by Amazon in addition to the backup I keep myself (hard drives and DVDs). But it was very troublesome to get them uploaded. Uploads I start generally freeze or quit after about a day. Then I have to see what got transferred and start another upload. It's taken weeks, with many stops and starts, to get the photos uploaded, and I have many gigabytes of new photos that aren't yet uploaded. An ongoing project. It's possible that Jungle Disk's built-in backup utility would do a better job, but I haven't tested it yet for my photo data.
Dealing With the Threats
So far I've discussed backup principles (independence being the most important), the six threats, and described hardware and software choices. Now I'll discuss more specifically how each of the threats can be dealt with. To make things a bit simpler, I'll group the six threats into three major categories, because several backup methods deal with more than one threat:
Backup for User Error, Computer Failure, and Disappearance
For desktops that stay connected to lots of things anyway (network, speakers, keyboard, mouse, external drives, iPods), you should attach a large external drive, via FireWire or USB 2.0, and keep it running all the time. If you turn it off you'll forget to turn it on again and you'll miss backups. You don't have to worry about surge or fire, because they're in a different threat category (see below).
One of the best things about keeping an external drive permanently connected and switched on is that backup to it can be completely automatic. For that to work, the backup software has to run automatically as well. Of course, your computer has to be on for the automatic backup to run, which generally means leaving it on all the time.
For a laptop, you'll have to connect it when you want to back it up and then disconnect it. Backups won't be entirely automatic.
If all your data is on your computer's internal disk, it's best to get an external drive that's the same size or bigger and just copy your entire disk to it each night, or, if you have a Mac running OS X 10.5 (Leopard), use Time Machine, which does copy all the data to the drive, but in the form of versions. (It's straightforward but slow to recreate a crashed internal disk from Time Machine. You boot from the Leopard DVD. Here's an article about how to do it.)
If you have several active disks, there are two cases:
On a Mac, if you don't have Leopard you should use Super Duper to clone the internal disk.
On Windows, there are lots of choices for cloning the internal disks, including Complete PC Backup, which comes with Business, Ultimate, and Enterprise editions of Vista. (But see above for its limitations.)
Other choices for backing up your entire Windows system are Norton Ghost, Retrospect, and Acronis True Image, none of which I have tested. (I did use Retrospect for years, but never had to do a restore.)
(31-Jan-2008 update) I've now tried Acronis True Image Home ($37 or so from Amazon), and it seems like a fine program that works well. It can image your main drive, so you can boot directly from the backup, and also back up just the folders you want. It's a much better choice than the backups built into Vista (Complete PC Backup and Back Up Files) because it allows you to verify the backup and restore individual files (without installing VHDmount, which is a pain), and, unlike Back Up Files, it allows you to control exactly what's backed up. It can even send you an email when it's finished. There's a 15-day free trial.
Most complete system utilities refresh the backup by only replacing files that have changed. But you don't get to keep previous versions; all you have is the most current complete backup. Also, if the backup fails, you may have nothing, which might mean that you are not backed up at all, which is an unacceptable situation, even for a minute. A practical solution to backup failure, since it's pretty rare, is to put it into the Office Destruction category, which I will discuss shortly. In other words, if the backup fails and the primary disk also fails, you will suffer the pain of recovering from Office Destruction. If that's too much, then take the external backup drive offline once a week (or every other day, or whatever) and replace it with a fresh one, perhaps rotating the drives so you don't have to keep buying new ones.
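To make "refresh the backup by only replacing files that have changed" concrete, here is a rough Python sketch of the idea. It is not how SuperDuper, Acronis, or any other product actually works; the function name and the example paths are hypothetical, and it decides that a file has changed purely from its size and modification time.

```python
import shutil
from pathlib import Path


def refresh_backup(source_dir, backup_dir):
    """Copy files that are new or changed since the last refresh.

    A file is treated as unchanged if the backup copy has the same size and
    is at least as new. Returns the relative paths that were copied.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for src in source_dir.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if dst.is_file():
            s, d = src.stat(), dst.stat()
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue  # unchanged, skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(rel)
    return copied


# Example call (hypothetical paths):
# refresh_backup("/Users/me/Documents", "/Volumes/Backup/Documents")
```

Notice that each changed file simply overwrites the old copy. That's exactly why this style of backup keeps no previous versions, and why an interrupted refresh can leave the backup in an uncertain state.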
Backup for Surge
A surge protector is a good idea, but you still have to protect against surge, and that requires that the backup device be completely unplugged. The drive you're using for automatic backup (protection against User Error, Computer Failure, and Disappearance) can't be unplugged, so it doesn't qualify.
If surge is very common in your area (either from power-utility problems or from lightning), you may want to keep a weekly complete backup drive offline, just as you would to protect against both the nightly automatic backup and the main computer failing simultaneously. But, if surge is rare, and especially if you're using a surge protector, you can simply consider surge to be Office Destruction and deal with it that way (next section).
Backup for Office Destruction and Regional Disaster
To protect against these attacks you have to get the backup media offsite, and there are only two ways to do that:
The actual backup hardware and software for offsite backup isn't any different from what you use to protect against User Error, Computer Failure, and Disappearance (see above). The difference is the independence you gain by moving the media or device offsite. The problem is that this requires extra work and isn't automated, so it may not happen. It doesn't do any good if an office fire destroys the drive that's been next to the door waiting for somebody to take it away.
I use two drives: One is offsite (at a friend's house, about 20 miles away), and the other is in my office. About once a week I write a new backup to the drive that's at hand, and then I take it with me when I see my friend and swap drives. That way I have at worst a one-week-old backup 20 miles away. (Of course, a one-hour-old backup is online, hidden underneath my desk, but that's to deal with a different group of threats, as I explained above.)
I'm the least qualified person in the world to give anyone advice on how to develop good organizational habits, but here's some advice anyway. These are things I actually manage to do.
For active work, I don't want to risk losing a week of data. I keep a USB flash drive near my computer (even in the PC Card slot on my laptop), and write to it as I work. I burn CDs and DVDs and put them in my media safe, or take them to my friend's house. Remember, as long as the media or device is small and not plugged in, you're protected against User Error, Computer Failure, Disappearance, and Surge.
My Backup Plan
I've mentioned various parts of my own backup plan already, but here's the whole plan organized by form of attack. You should organize your own plan that way, too, to ensure that you're covered.
Evaluating Your Backup Plan
Maybe you don't have an explicit plan, but let's call whatever you're doing now your plan. You should consider each of the six threats one-by-one and ask yourself two questions about each:
Entire contents of this web site Copyright 2006-2008 by Marc Rochkind. All rights reserved.