
Lessons Learned From My Memory Problem

December 20, 2007

This is Part 2 of two entries about a recent memory problem with my Mac. Click here for Part 1.

Lesson 1: The worst thing a computer can do is give the wrong answers

Not really a lesson, since I've known it for a long time, but it's worth emphasizing. We all get annoyed when a computer shows an error alert, especially the so-called "blue screen of death". (Macs use a much prettier little black box, but it still means death.) We don't like it when the computer hangs, completely unresponsive. And, we really hate it when the computer just goes dark. But none of these are even close to the worst thing a computer can do: Give the wrong answer.

Suppose you're in the hospital with a seriously wounded leg and the surgeon has come into your room to give you the prognosis. Would you rather hear "Sorry, computer problem, I'll come back tomorrow" or, because the computer quietly gave a wrong result, "Sorry, the leg has to come off"?

Or, suppose you called 911, and the fire department got the wrong address? Or the Dept. of Defense confused you with someone else and put you in Guantanamo for the rest of your life?

Lesson 2: ImageVerifier is really good, and very important.

The rigorous test performed by ImageVerifier, calculating a 128-character hash for each of thousands of files, was the only indication that I had a bad memory module. It's likely that some of the random and very infrequent software crashes I'd had over the past couple of years had the same cause, but I'll never know, since crashes are common even with perfect hardware. Usually, a random crash isn't reproducible, so there's no way to really know what the cause was. Not so with ImageVerifier: It failed every time, although not always with the same files.

Also, I now know that copies of image files (or any files) can be wrong even without the computer complaining, so having a hash check is all that keeps the images from being lost.
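To make the idea of a hash check concrete, here is a minimal sketch in Python. It is not how ImageVerifier works internally. The function names are made up for illustration, and SHA-512 is an assumption, chosen only because its hex digest happens to be 128 characters long. The sketch records a hash for every file in a folder, then later reports any file whose contents no longer match.

```python
import hashlib
from pathlib import Path


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest (128 hex characters) of a file.

    SHA-512 is an assumption made for this sketch, not a statement
    about the algorithm ImageVerifier uses.
    """
    h = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in chunks so large image files don't have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its hash."""
    folder = Path(folder)
    return {
        str(p.relative_to(folder)): file_hash(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify(folder, manifest):
    """Return the relative paths that are missing or whose hash has changed."""
    folder = Path(folder)
    bad = []
    for rel, expected in manifest.items():
        p = folder / rel
        if not p.is_file() or file_hash(p) != expected:
            bad.append(rel)
    return bad
```

You would call build_manifest() once on a folder you trust, save the result, and run verify() against the original or any copy later. A single flipped bit anywhere in a file is enough to change its hash. With a bad memory module, the hash calculation itself can also come out wrong, which would explain a check that fails on different files from one run to the next.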

Lesson 3: A computer may not tell you when it fails.

Many people are under the impression that, as annoying as hardware problems are, at least you'll know when they happen. It's just not true, as my recent memory problem proves. I knew there was a problem because of my hashing tests, but all the while the computer went along behaving normally, just as it had since the day I bought it over two years ago.

Lesson 4: All computers used for critical work should have ECC memory.

ECC stands for Error Correcting Code. It's a memory design that uses extra bits to determine if a bit has failed and, in most cases, to correct it. ECC memory costs extra and very few desktop systems have it. In the Apple line, only the MacPro (separate CPU box and monitor, like most Windows PCs) and server systems have it. Laptops and iMacs don't. Most Windows desktops and laptops don't have ECC memory either, but at least the desktops that do are much cheaper than Apple's top-of-the-line MacPro.
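For anyone curious how a few extra bits can both find and fix a bad bit, here is a toy sketch in Python of the classic Hamming(7,4) code: 4 data bits protected by 3 check bits. It illustrates the principle only and is not the circuit in any particular memory module. As far as I know, real ECC modules use a much wider code of the same family, with one more parity bit so that two-bit errors can at least be detected. The function names are my own.

```python
def encode(nibble):
    """Encode 4 data bits (a value 0-15) as a 7-bit Hamming codeword.

    Bit positions are numbered 1..7. Positions 1, 2 and 4 hold check
    bits; positions 3, 5, 6 and 7 hold the data bits.
    """
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8  # index 0 is unused so indexes match positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    return code[1:]


def decode(bits):
    """Return (nibble, error_position) for a 7-bit codeword.

    error_position is 0 when no error was seen; otherwise it is the
    position (1..7) of the single bit that was flipped and then fixed.
    """
    code = [0] + list(bits)
    # XOR-ing the positions of all 1 bits gives 0 for a valid codeword,
    # or the position of the bad bit if exactly one bit was flipped.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    if syndrome:
        code[syndrome] ^= 1  # correct the flipped bit
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome
```

If you encode a value, flip any one of the seven bits, and decode, you get the original value back along with the position of the bit that was bad. Flip two bits and this small code gets it wrong without saying so, which is the reason for the extra parity bit mentioned above.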

If you Google "ECC memory", you'll find lots of articles that say it's not necessary for desktops and laptops, but only for servers. This is just self-serving nonsense, because the computer companies want to be able to compete on price. The theory is that desktops and laptops are used for consumer, personal work, whereas servers are used for important things. It's easy to come up with counterexamples: Servers that are used for fantasy football and YouTube videos, and desktops that are used for medical records and dispatching emergency services.

How much more does ECC memory cost? A lot more, maybe 50% or so, for two reasons: More chips and electronic complexity are involved, and the demand is much lower than for non-ECC memory. Not that you have the choice, since few desktop systems can take ECC memory or do anything special with it if it's installed.

Where does that leave me?

My iMac G5 is fixed now, but it's near the end of its useful life, or at least it is the way I use it. It can't be upgraded beyond 2GB and it has only one CPU (one core), so it can barely keep up with the image processing I do. I'm thinking of replacing it sometime in the next 6 months or so. But now, I don't think I want another iMac, even one with two cores and 4GB. No ECC. That means getting a MacPro, which would cost at least $1000 more (but with four cores instead of two).

On the other hand, there's a PC running Vista about two feet away with 2 dual-core Opterons and 2GB of ECC memory (upgradable to 16GB). I use it only for working on the Windows versions of ImageIngester and ImageVerifier. Maybe I should switch back to Windows?

I probably won't. When it's time to upgrade, I guess I'll just go to a MacPro.

Should cheap desktops be used for critical work?

Never mind me. What about all those desktops out there being used in medicine, emergency services, accounting, and lots of other things all more important than what I do? Do the people who buy and deploy these systems know that the memory has no error correction, or even detection, and that failures can result in wrong answers? I'm sure they don't. Everyone just assumes that computers work until they fail, and that failures come with an error dialog, a blue screen, or an unresponsive system.

(The need for reliability isn't limited to complex calculations. As my own experience with ImageVerifier showed, even storing or printing a result can be erroneous.)

Critical computing ought to take place only on the most reliable machines, and, since MacPros are so much more expensive than more ordinary Macs, that probably means that Macs are going to be too expensive. But, most people use Windows machines in business, and for those the price premium is a lot less.

For example, Dell has a desktop with a 2.4GHz Intel Core 2 Duo, 2GB of ECC memory, a 250GB drive, a 19 in. monitor, and Windows Vista for $1600. The cheapest MacPro with 2GB of ECC memory and a 250GB drive is $3100, but it has 4 cores, which is the fewest a MacPro can have. A 2GB MacPro with such a small drive is an odd beast, and any MacPro is overkill for typical use in an office, even one that does critical work. While the comparison isn't exact, it's clear that Apple doesn't offer any suitable office system with ECC memory. Dell and lots of other vendors do.

I'm not saying that offices shouldn't use Macs, only that Apple doesn't think that that market needs ECC memory. I don't agree.

The people who buy office computers for critical use don't want to pay extra for ECC memory (if they even know that what they do buy lacks it, and what that means, which I doubt). I don't agree with that, either.

The bottom line is that people complain all the time about how unreliable computers are, yet there's been a technical solution to at least part of the problem (a serious part) for a long time, and nobody's using it. That's wrong.


Entire contents of this web site Copyright 2006-2008 by Marc Rochkind. All rights reserved.