NOTE: This article was the basis for the article "Backup for Photographers" that appeared in the Dec-Nov 2009 issue of Photo Technique, in collaboration with Uwe & Bettina Steinmueller. However, I always liked the original article better.
References to specific hardware, software, and services are way out of date, but the overall advice is still good. For an update, see this blog article.
(Updated 31-Jan-2008, 7-Feb-2008, and 8-Feb-2008; updates in red)
I've read countless articles about backup over the years, but not one of them approached the problem properly. They all focus too much on hardware and software, and not enough on what I call independence—getting the copies and the originals separated so they're not subjected to the same threats. And, too often they deal with just one or two threats, most commonly a disk crash, and ignore the others. In fact, not one article even bothered to enumerate the threats to help check that the backup plan deals with each one of them.
I've been thinking for nearly a year about writing my own article that would set matters straight, and now it's time. I blogged a bit about backup last August and promised more; what you'll find here is much more.
I'll cover the principles of backup, list the threats to your data (there are six of them), describe the specific hardware, software, and online facilities available for Windows and OS X, and then outline some specific backup plans. My focus is on personal-computer backup rather than servers or mainframes, although the principles apply universally. Also, I only briefly touch on data privacy (preventing unauthorized use of data); unfortunately, backup usually reduces data privacy, since the more copies there are the more opportunity for someone to gain access to one.
Here's a detailed outline:
What is Backup?
Threats to Your Data
The Perfect Backup Solution
Protecting Against Threats
Planning for Backup and Restoring
Convenience vs. Independence
You Need a Backup Plan
You Need a Restore Plan
FireWire and USB External Drives
Network Attached Storage (NAS)
Minimum Requirements for Backup Software
OS X: In-the-Box Backup Software
OS X: Add-On Backup Software
Windows Vista: In-the-Box Backup Software
Vista and XP: Add-On Backup Software
Online Backup Services
Dealing With the Threats
Backup for User Error, Computer Failure, and Disappearance
Backup for Surge
Backup for Office Destruction and Regional Disaster
My Backup Plan
Evaluating Your Backup Plan
There are some links below to Amazon. If you click on one to buy anything, it won't cost you extra, but I get a fee from Amazon.
Disclosure: I have done some consulting work for Microsoft, and some of that work had to do with storage systems, although not with backup directly. As you will see as you read below, I don't think that association biased my conclusions.
A simple example of backup is copying files from a laptop to a CD (the act of copying), putting the CD in your desk at home (independence of location), and having the laptop stolen (the destructive event). Since the CD wasn't stolen, you still have the data that was copied to it. Sure, your laptop is gone and maybe sensitive information has fallen into the wrong hands, but at least you still have the data you saved to the CD. (You cover loss of the laptop with insurance, and you protect sensitive data with encryption, but those subjects are outside the scope of this article.)
To design and implement a backup plan one has to consider the possible threats to data (e.g., theft, electrical surge, fire), the various ways to copy data (e.g., using the Mac Finder or Windows Explorer, using a backup utility, using CD/DVD-burning software), and ways of achieving independence (e.g., online storage, placing media in a safe-deposit box).
Unfortunately, nearly all articles about backup focus on the copying, and ignore the other two (threats and independence). But without evaluating all the threats, there's no way to be sure that the backup will allow you to recover from them, and insufficient independence means that both the data and the backup can be destroyed by the same event. An obvious example is a fire that destroys everything in an office, including any backups kept there.
Additionally, if backup isn't convenient—automatic, ideally—it won't get done often enough to be effective. (It's common after a data loss for someone to regret that their most recent backup is months old.) Restore has to be convenient, too, or else the damage will be compounded. An electrical surge is bad enough, but if your data is unavailable for a week while you restore it from an online service, you still lose a week of productivity.
Since backup is potentially expensive and time-consuming, you also have to consider the importance of your data. Backing up irreplaceable photographs is more important than backing up application preferences, and backing up browser caches or temporary files isn't important at all.
Unfortunately, there's too much data. For example, if a photographer comes back from a shoot with 20GB of photos (not unusual) and has a T1 line (1.544 megabits per second) operating at 100% efficiency (extremely unusual), it would take 29 hours to copy the photos to an online service. Every 1GB of image data modified (50 photos at 20MB each) would take an additional 1.5 hours or so to upload. That's assuming that there is a T1 line, that it operates at 100% efficiency, that the line isn't being used for anything else, and that the online services can receive and store the data that fast. At a more realistic upload speed, say 500 kilobits per second, it would take more than 3 days to upload the 20GB, by which time the photographer might have shot another 60GB. The backup would never finish.
Oh, I forgot... the photographer needs 2 T1 lines, because we were going to use two independent services.
So, the problem with the perfect backup scheme is that it won't work. We need to back up to hard drives and/or optical disks, and that gets very complicated, as you already know.
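The upload times quoted above are easy to check. Here is a minimal Python sketch of the arithmetic, assuming decimal gigabytes (1GB = 10^9 bytes) and a link that is used for nothing else; the function name `upload_hours` is just an illustration.

```python
def upload_hours(size_gb: float, link_kbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to upload size_gb gigabytes over a link of link_kbps kilobits per second."""
    bits = size_gb * 1e9 * 8                          # decimal gigabytes, 8 bits per byte
    seconds = bits / (link_kbps * 1e3 * efficiency)   # efficiency of 1.0 means a perfect line
    return seconds / 3600


T1_KBPS = 1544  # a T1 line is 1.544 megabits per second

print(f"20 GB over a T1:     {upload_hours(20, T1_KBPS):.1f} hours")
print(f"1 GB over a T1:      {upload_hours(1, T1_KBPS):.1f} hours")
print(f"20 GB at 500 kbit/s: {upload_hours(20, 500) / 24:.1f} days")
```

Lowering `efficiency` below 1.0, or doubling `size_gb` to account for two independent services, only makes the numbers worse.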
Here are some things you can do:
As I said, even if you do any of the above, it doesn't allow you to get away without backups. It just lessens the chance that you'll need them.
For example, using a background backup utility like Apple's new Time Machine, available with OS X 10.5 (Leopard), is convenient, but since the backup drive has to be within WiFi range and plugged in to power, it doesn't provide enough independence to protect against Surge, Office Destruction, or Regional Disaster. But, since those threats are so much less common than User Error, Computer Failure, and Disappearance, running Time Machine is a great idea. It's just not the only idea.
For protection against surge, all you need to do is back up to an external disk (probably not with Time Machine) that you can unplug and, for good measure, put into a fireproof media safe. Store that drive in a neighbor's house and you'll protect against Office Destruction as well. Take it to your mother's house 25 miles away and you're protected against most Regional Disasters. Copy your irreplaceable files, such as photos, to Amazon's S3 and you're even more completely protected.
The way we arrived at the combination of methods in the previous section was to list the six types of threats, list the available backup methods, and then pair them up to ensure that we were covered. The more backup methods available and the more you know about them, the more effectively you can come up with something you can live with. If your plan is too inconvenient, you'll find you're not using it, and then you won't be protected.
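The pairing exercise can be written down as a small coverage check. The sketch below is only an illustration in Python: the threat names come from this article, but the method descriptions and the exact threat sets assigned to each method are my reading of the examples above, not a definitive table.

```python
THREATS = {
    "User Error", "Computer Failure", "Disappearance",
    "Surge", "Office Destruction", "Regional Disaster",
}

# Which threats each backup method is assumed to cover (illustrative assignments).
PLAN = {
    "Time Machine to an always-on drive": {"User Error", "Computer Failure", "Disappearance"},
    "Unplugged external drive in a media safe": {"Surge"},
    "Drive stored at a neighbor's house": {"Surge", "Office Destruction"},
    "Drive stored 25 miles away": {"Surge", "Office Destruction", "Regional Disaster"},
}


def uncovered(plan):
    """Return the threats that no method in the plan protects against."""
    covered = set().union(*plan.values())
    return THREATS - covered


time_machine_only = {name: threats for name, threats in PLAN.items() if name.startswith("Time Machine")}
print("Time Machine alone leaves:", sorted(uncovered(time_machine_only)))
print("Full plan leaves:", sorted(uncovered(PLAN)))
```

An empty result for the full plan is the goal; any threat that shows up in the output needs another method added to the plan.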
As I'll explain below, neither Windows (even Vista) nor OS X comes with sufficient backup software, so you'll need to buy a third-party utility.
You'll have to spend some money, mostly for software and external drives. The software will cost less than $100, a couple of 120GB drives for your most important data will cost about $100 each, and each 500GB drive will cost less than $200. Amazon's S3 service is really cheap, only a few dollars a month. So, for about $500, a little planning work on your part, and a slight change to your work habits you can get almost 100% protection from the six types of threats.
Even without actually doing a complete restore, you should spot-check the backup to ensure that your files are really there. Backup software that won't let you do this, such as Vista's Complete PC Backup, should be avoided.
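If the backup is an ordinary folder tree that mirrors the original (an assumption; many backup programs use their own archive formats), a spot-check can be scripted. This is a minimal Python sketch, with `spot_check` and its parameters invented for illustration: it picks a random sample of files and compares each one with its copy in the backup.

```python
import hashlib
import random
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MB chunks so large photos don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def spot_check(source: Path, backup: Path, samples: int = 20, seed=None):
    """Return relative paths of sampled files that are missing from, or differ in, the backup."""
    files = [p for p in source.rglob("*") if p.is_file()]
    chosen = random.Random(seed).sample(files, min(samples, len(files)))
    problems = []
    for src in chosen:
        rel = src.relative_to(source)
        dst = backup / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel)
    return problems
```

Calling `spot_check(Path("/Users/me/Pictures"), Path("/Volumes/Backup/Pictures"))` (both paths hypothetical) should return an empty list if the sampled files are really there. A file edited since the last backup will also be reported, so run the check right after a backup completes.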
Your restore plan also has to include a way to replace damaged hardware. If you live near computer stores and they're open when you need them, you might be able to just go out and buy what you need when you need it. But if not, and time-to-restore is important, you'll have to have replacement equipment on hand. The equipment really has to be available: if the replacement hard drive has data on it, you won't be able to use it without destroying that data.
After a restore, make sure you don't start running without a backup. For example, suppose you keep a complete, bootable copy of your primary drive on an external drive. If the primary drive fails, you can boot from the external drive, which gets you up and running immediately, losing only a few hours of work. But, if you run that way, you no longer have your backup, since the backup drive has become the primary and the old primary is dead. Instead, you should immediately clone the backup to a replacement primary drive or, if that's not feasible, clone the backup to a second external drive.
If the drive is going to be running all the time, put it out of sight, such as on a shelf under your desk, or on a bookshelf with some books or a family photo in front of it. Figure that a thief won't know your drive is even there and, even if he or she does, nobody wants to heist a $200 drive when there are computers, CDs, and jewelry to take instead, all much easier to fence.
The best-known new NAS device is Apple's Time Capsule, which is a 500GB or 1TB drive combined with a wireless base station. It allows you to back up, with or without Time Machine (OS X Leopard's built-in backup software), to the drive wirelessly. This means it can be farther from your computers than a directly-connected drive, such as in a closet, on a high bookshelf, or in another room. (Friendly neighbors might even be able to put their Time Capsules in each other's houses, although I haven't tested that arrangement.)
One popular form of NAS is another computer on the network, connected with WiFi (like Time Capsule) or wired. But another computer is much more expensive than just an attached drive, requires much more power and space, and introduces another machine to be set up, booted, maintained, and possibly even backed up. A little box you can just attach and forget about is what's really needed.
To find NAS boxes, go to Amazon and search for "nas drive", "nas disk", "ethernet drive", or "ethernet disk". You'll find drives with network attachments, drive cases with network attachments (you have to add the drive), cases with WiFi and with and without drives, and even cheap ($85) devices to which you can attach a USB drive. Here are some examples, none of which I'm recommending, as I haven't tried them:
Some things to watch out for when you buy an NAS device:
There are various RAID arrangements, the two most popular of which are RAID-1 (mirroring) and RAID-5 (striping with parity). The idea is that, if a disk fails, it can be removed from the running system and replaced without the system going down or any data being lost. When you replace the defective disk, the RAID system automatically restores, over a period of hours, the data that was on it. (So-called RAID-0, also called striping, isn't really RAID at all because there's no redundancy. The loss of either disk destroys all the data on both disks. RAID-0 is for performance, not for reliability, which is actually reduced.)
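To make the parity idea concrete, here is a small Python sketch of how a lost block can be rebuilt from the surviving blocks. It is a simplification: it uses one fixed parity block, whereas a real RAID-5 array rotates parity across all the disks, and the block contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of several equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: a block on each of three data disks, plus a parity block on a fourth.
data = [b"PHOTO-01", b"PHOTO-02", b"PHOTO-03"]
parity = xor_blocks(data)

# The second disk fails. XOR-ing the survivors with the parity block recovers its contents.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same sketch shows the limit of the scheme: lose two blocks of the stripe and there is nothing left to XOR against.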
RAID is OK if you're using it to replace what otherwise would be a single drive (primary or backup), but it doesn't substitute for backup. Moreover, if your budget is limited, the extra money you spend on RAID might mean that you have less to spend on backup.
To see why RAID doesn't diminish the need for backup, we can do a quick threat analysis, as we do for all other backup methods:
Anyone who thinks RAID provides sufficient protection is focused too narrowly on a single kind of failure, the disk itself, and is ignoring the other threats.
However, many backup programs provide a way to encrypt the backup data, and that's what you should do. If your backup utility doesn't provide encryption, you may be able to encrypt the data anyway by encrypting the volume the data is written to. On a Mac, you can use Disk Utility to create an encrypted disk image, as I explain here. (8-Feb-2008 update) I'm not sure the method in that article is practical. I've updated the article to explain why.
A much better choice is Super Duper.
I've used Super Duper to make both bootable disk images and partial backups for carrying offsite, but haven't been able to for a few months because it doesn't work with Leopard yet. (It does now.)
I also used iBackup before I got Super Duper, and it seemed to work fine, although it won't back up a complete system like Super Duper will.
I've tried Retrospect for the Mac, which interested me because it's one of the few backup programs that can write CDs/DVDs, but it repeatedly hung trying to write a DVD.
I've just written my own utility to copy folders and files to multiple CDs and DVDs, SpanBurner ($10).
Vista comes with four backup systems:
(31-Jan-2008 update) I don't think I was right about Virtual PC; mounting a VHD isn't something it's supposed to do, near as I can tell. But, I did install VHDmount by downloading Virtual Server (free), choosing to custom-install it, selecting only VHDmount to be installed, and then running VHDmount on the VHD file written by Complete PC Backup. Worked perfectly. So, if you're willing to install and use VHDmount (you don't have to use Virtual Server, or even install the whole thing), there is a way to verify that your backup was written and to restore individual files.
Complete PC Backup doesn't have a built-in scheduler, but you can schedule it from the Task Scheduler, which is on the Administrative Tools menu. Choose Create Basic Task, enter a name and set the time, enter "wbadmin" for program, and "start backup -backupTarget:F: -include:C: -quiet" in the arguments field, assuming you want to back up drive C to drive F. On the next panel check "Open the Properties dialog for this task when I click Finish" and, when it opens, check "Run with highest privileges".
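If you set this up for more than one machine or drive pair, it's easy to mistype the arguments. The following Python sketch does nothing more than assemble the arguments string quoted above for a given pair of drive letters; the helper name `wbadmin_arguments` is hypothetical, and the script neither runs wbadmin nor touches Task Scheduler.

```python
def wbadmin_arguments(source_drive: str, target_drive: str) -> str:
    """Build the text for the Task Scheduler arguments field, e.g. for C: backed up to F:."""
    for drive in (source_drive, target_drive):
        if len(drive) != 2 or not drive[0].isalpha() or drive[1] != ":":
            raise ValueError(f"expected a drive like 'C:', got {drive!r}")
    if source_drive.upper() == target_drive.upper():
        raise ValueError("source and target must be different drives")
    return f"start backup -backupTarget:{target_drive} -include:{source_drive} -quiet"


print("Program:  ", "wbadmin")
print("Arguments:", wbadmin_arguments("C:", "F:"))
```

Paste the two printed values into the program and arguments fields of the Create Basic Task wizard.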
To my way of thinking, even though Complete PC Backup seemed to work (I never tried a restore), the fact that ordinary users (who don't install VHDmount) have no way to check its output is a serious defect. Also, and probably for the same reason, you can't restore individual files; you can only use the VHD file, all of it, during a fresh install. So, it fails my Minimum Requirements for Backup Software (see above), unless you install VHDmount.
So, Back Up Files fails my Minimum Requirements for Backup Software. (I still use it because my computer happens to have four mostly unused drives inside it, but I don't count it as part of my backup plan.)
With the more advanced versions of Vista, you can run Back Up Files automatically according to a schedule you set.
Previous Versions provides substantial protection against User Error if the good stuff was from yesterday, since you only get one version per day. If it was last week, you may not have it. Contrast this with Time Machine, which keeps hourly backups for a day, daily backups for a month, and weekly backups forever, space allowing. All on an external drive. And, it's the entire system, not just user files that have changed. And, you can restore an entire system from the Time Machine backup.
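A retention schedule like the one described for Time Machine can be sketched in a few lines of Python. This is not Apple's algorithm, only an illustration of the policy as stated: it assumes "a day" means 24 hours and "a month" means 30 days, and it keeps the newest snapshot in each day or week.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now):
    """Keep everything from the last 24 hours, one per day for 30 days, one per week beyond that."""
    keep = set()
    seen_days, seen_weeks = set(), set()
    for ts in sorted(snapshots, reverse=True):      # newest first
        age = now - ts
        if age <= timedelta(hours=24):
            keep.add(ts)                            # all recent (hourly) snapshots
        elif age <= timedelta(days=30):
            if ts.date() not in seen_days:          # newest snapshot of each calendar day
                seen_days.add(ts.date())
                keep.add(ts)
        else:
            week = tuple(ts.isocalendar())[:2]      # (ISO year, ISO week number)
            if week not in seen_weeks:              # newest snapshot of each week
                seen_weeks.add(week)
                keep.add(ts)
    return sorted(keep)


now = datetime(2008, 2, 8, 12)
hourly = [now - timedelta(hours=h) for h in range(24 * 60)]   # 60 days of hourly snapshots
print(len(hourly), "snapshots thinned to", len(snapshots_to_keep(hourly, now)))
```

With one snapshot per day, as with Previous Versions, the first tier is missing: anything saved and then spoiled within the same day has no earlier copy to fall back on.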
So, while Vista has four backup systems, even all of them used together don't provide protection from the six threats. Each of them has serious flaws:
All of the Vista backup systems use the Volume Shadow Copy Service (VSS) to ensure that only internally consistent files are backed up, even if an application has them open. OS X doesn't have anything like that; Apple's thinking is that Time Machine will get the file next time. None of my criticisms above are aimed at VSS, which is an excellent piece of work.
I have heard that XP lacks most of the four backup systems in Vista, but that there is a backup program, not installed by default, that works better than Back Up Files, in that you can at least tell it what you want backed up.
For years I used Retrospect on Windows, and it ran reliably. I can't say for sure that it would have restored my entire system, because I never tried it. When I spot-checked for files on the backup, they were always there.
A widely-used Apple online backup service comes with .Mac. You use the Backup utility, which I mentioned above.
Unfortunately, in two years of trying I have never gotten Backup to work reliably with .Mac, so I don't recommend it. The problem is that when I set it up it runs OK for a few days, then quits with absolutely no notice whatsoever (other than an obscure log entry in a place I never look), and I have no idea how to get it started again. Most recently, with over 20GB free on my .Mac account, it quit claiming it was out of space. As I mentioned above, I haven't gotten it to back up to DVDs, either.
.Mac also comes with a virtual online disk, called iDisk, which you can read and write like an ordinary disk, only much slower. I have had some success dragging things to iDisk with the Finder. I've also used Transmit to create a droplet that sits on my dock. Any files or folders I drag from the Finder to the droplet go off to iDisk, which is something I do every hour or so when I'm working on an important document.
The online backup service I like the best is Amazon S3. Amazon has designed it to be extremely reliable and available, it's widely supported, and it's very cheap. For example, .Mac costs about $6 per GB per month, but S3 is only 15 cents per GB per month, or 13 times cheaper. S3 also charges a transfer fee of 10 cents per GB in, and a bit more coming out; transfer in is what you'd use for backup. If you transfer 50 GB a month, which is a huge amount to transfer online, that's still only $60 per year. (At DSL speeds it would take about a week to transfer 50GB, assuming the line is doing nothing else and your computer stays up for a week.)
Also, as far as I know, .Mac works only with OS X, whereas S3 works with any system, so you can use it to share files across platforms.
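The S3 figures above reduce to simple arithmetic. Here is a minimal Python sketch using only the two prices quoted in this article (15 cents per GB per month for storage, 10 cents per GB for transfer in); the function names are illustrative, and current prices will differ.

```python
STORAGE_PER_GB_MONTH = 0.15   # dollars, as quoted in this article
TRANSFER_IN_PER_GB = 0.10     # dollars, as quoted in this article


def yearly_transfer_cost(gb_uploaded_per_month: float) -> float:
    """Dollars per year spent on uploads at a steady monthly volume."""
    return gb_uploaded_per_month * TRANSFER_IN_PER_GB * 12


def monthly_storage_cost(gb_stored: float) -> float:
    """Dollars per month to keep gb_stored gigabytes on the service."""
    return gb_stored * STORAGE_PER_GB_MONTH


print(f"Uploading 50 GB a month: ${yearly_transfer_cost(50):.2f} per year")
print(f"Keeping 100 GB stored:   ${monthly_storage_cost(100):.2f} per month")
```

Note that the storage charge grows with everything you have ever uploaded and not deleted, while the transfer charge depends only on what you send each month.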
I use S3 in three ways:
There are numerous online backup systems for Windows, but probably none are as cheap as S3.
As I've said, S3, or any online storage, is really convenient for small files. For large amounts of data, it's still a good long-term solution, but it takes a lot of time to upload the data. It's worth it, however, for important data that doesn't change. I have thousands of photos on S3 that are now backed up by Amazon in addition to the backup I keep myself (hard drives and DVDs). But it was very troublesome to get them uploaded. Uploads I start generally freeze or quit after about a day. Then I have to see what got transferred and start another upload. It's taken weeks, with many stops and starts, to get the photos uploaded, and I have many gigabytes of new photos that aren't yet uploaded. An ongoing project. It's possible that Jungle Disk's built-in backup utility would do a better job, but I haven't tested it yet for my photo data.
One of the best things about keeping an external drive permanently connected and switched on is that backup to it can be completely automatic. For that to work, the backup software has to run automatically as well. Of course, your computer has to be on for the automatic backup to run, which generally means leaving it on all the time.
For a laptop, you'll have to connect it when you want to back it up and then disconnect it. Backups won't be entirely automatic.
If all your data is on your computer's internal disk, it's best to get an external drive that's the same size or bigger and just copy your entire disk to it each night, or, if you have a Mac running OS X 10.5 (Leopard), use Time Machine, which does copy all the data to the drive, but in the form of versions. (It's straightforward but slow to recreate a crashed internal disk from Time Machine. You boot from the Leopard DVD. Here's an article about how to do it.)
If you have several active disks, there are two cases:
On a Mac, if you don't have Leopard you should use Super Duper to clone the internal disk.
On Windows, there are lots of choices for cloning the internal disks, including Complete PC Backup, which comes with Business, Ultimate, and Enterprise editions of Vista. (But see above for its limitations.)
Other choices for backing up your entire Windows system are Norton Ghost, Retrospect, and Acronis True Image, none of which I have tested. (I did use Retrospect for years, but never had to do a restore.)
(31-Jan-2008 update) I've now tried Acronis True Image Home ($37 or so from Amazon), and it seems like a fine program that works well. It can image your main drive, so you can boot directly from the backup, and also back up just the folders you want. It's a much better choice than the backups built into Vista (Complete PC Backup and Back Up Files) because it allows you to verify the backup and restore individual files (without installing VHDmount, which is a pain), and, unlike Back Up Files, it allows you to control exactly what's backed up. It can even send you an email when it's finished. There's a 15-day free trial.
(25-July-2016 update) I don't like Acronis True Image anymore. For why, see the blog article referenced at the top of this article.
Most complete system utilities refresh the backup by only replacing files that have changed. But you don't get to keep previous versions; all you have is the most current complete backup. Also, if the backup fails, you may have nothing, which might mean that you are not backed up at all, which is an unacceptable situation, even for a minute. A practical solution to backup failure, since it's pretty rare, is to put it into the Office Destruction category, which I will discuss shortly. In other words, if the backup fails and the primary disk also fails, you will suffer the pain of recovering from Office Destruction. If that's too much, then take the external backup drive offline once a week (or every other day, or whatever) and replace it with a fresh one, perhaps rotating the drives so you don't have to keep buying new ones.
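The "replace only what changed" behavior, and why it leaves you with a single version, can be seen in a short Python sketch. This is a simplified illustration and not how any particular product works: it assumes the backup is a plain folder tree, decides that a file has changed by comparing size and modification time, and never removes files that were deleted from the original.

```python
import shutil
from pathlib import Path


def refresh_backup(source: Path, backup: Path):
    """Copy files that are new or changed since the last run; return the relative paths copied."""
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if dst.is_file():
            s, d = src.stat(), dst.stat()
            if d.st_size == s.st_size and d.st_mtime >= s.st_mtime:
                continue                      # unchanged since the last refresh
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)                # overwrites the old copy; copy2 keeps the modification time
        copied.append(rel)
    return copied
```

Because `shutil.copy2` overwrites the previous copy, a file that was damaged before the refresh runs replaces the good copy in the backup, which is exactly the exposure described above.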
If surge is very common in your area (either from power-utility problems or from lightning), you may want to keep a weekly complete backup drive offline, just as you would to protect against both the nightly automatic backup and the main computer failing simultaneously. But, if surge is rare, and especially if you're using a surge protector, you can simply consider surge to be Office Destruction and deal with it that way (next section).
The actual backup hardware and software for offsite backup aren't any different from what you use to protect against User Error, Computer Failure, and Disappearance (see above). The difference is the independence you gain by moving the media or device offsite. The problem is that this requires extra work and isn't automated, so it may not happen. It doesn't do any good if an office fire destroys the drive that's been next to the door waiting for somebody to take it away.
I use two drives: One is offsite (at a friend's house, about 20 miles away), and the other is in my office. About once a week I write a new backup to the drive that's at hand, and then I take it with me when I see my friend and swap drives. That way I have at worst a one-week-old backup 20 miles away. (Of course, a one-hour-old backup is online, hidden underneath my desk, but that's to deal with a different group of threats, as I explained above.)
I'm the least qualified person in the world to give anyone advice on how to develop good organizational habits, but here's some advice anyway. These are things I actually manage to do.
For active work, I don't want to risk losing a week of data. I keep a USB flash drive near my computer (even in the PC Card slot on my laptop), and write to it as I work. I burn CDs and DVDs and put them in my media safe, or take them to my friend's house. Remember, as long as the media or device is small and not plugged in, you're protected against User Error, Computer Failure, Disappearance, and Surge.
Various application programs I use, such as BBEdit and Lightroom, also keep their own backups, which they write to an external drive.
As I mentioned, when I travel my laptop usually stays connected to my arm or shoulder.
Also, my photographs (most of which are on the hard drives also) are backed up onto Amazon S3 and onto DVDs.