On recovering from disk failure
July 28, 2005
Crash number two happened under peculiar circumstances (kids playing games in OS9 compatibility mode, plus behaving badly), so we assumed it was just software. I had a day-old backup, so we just reformatted and carried on. However, I made a crucial mistake when I reformatted: I did not take the time to “write zeros” to the new drive. It turns out that this is a good thing to do, because writing to every sector allows bad sectors to be detected and replaced (remapped) with spares.
Crash number three happened a month later. Now that the dust is clearing, and after much web-searching and data recovery, I have figured out roughly what happened. A very small number of blocks on the 250GB drive were bad (I think about 1 block in 5,000,000), and they appeared to be clustered. That means that if the initial restore did not use any of those bad blocks, the machine would appear to work just great, and it would continue to work great as long as nobody wrote vital data to one of those blocks. One in 5 million is pretty good odds, too, so it’s no surprise that things looked good for a while.
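To put that ratio in perspective, here is a back-of-the-envelope sketch in Python. The 512-byte block size and the decimal reading of "250GB" are assumptions, not something I measured, and the function names are made up for illustration. The last function also assumes bad blocks are scattered independently, which does not match the clustering I saw, so treat its result as a rough guide only.

```python
# Rough arithmetic only. The block size is an assumption, not a measurement.
DRIVE_BYTES = 250 * 10**9   # "250GB", read as decimal gigabytes
BLOCK_BYTES = 512           # assumed block size
BAD_RATE = 1 / 5_000_000    # "about 1 block in 5,000,000"


def total_blocks(drive_bytes=DRIVE_BYTES, block_bytes=BLOCK_BYTES):
    """Number of whole blocks on the drive."""
    return drive_bytes // block_bytes


def expected_bad_blocks(drive_bytes=DRIVE_BYTES, block_bytes=BLOCK_BYTES,
                        bad_rate=BAD_RATE):
    """Expected number of bad blocks at the given failure ratio."""
    return total_blocks(drive_bytes, block_bytes) * bad_rate


def chance_of_missing_all(blocks_written, bad_rate=BAD_RATE):
    """Chance that none of the written blocks is bad.

    Assumes each block is bad independently of the others, which is a
    simplification: clustered bad blocks would change this number.
    """
    return (1 - bad_rate) ** blocks_written


if __name__ == "__main__":
    print("total blocks:", total_blocks())
    print("expected bad blocks:", expected_bad_blocks())
    # About 1 GB of writes at the assumed block size.
    print("chance 1 GB of writes misses them all:",
          chance_of_missing_all(10**9 // BLOCK_BYTES))
```

Under those assumptions the drive has about 488 million blocks, of which roughly 98 would be bad: few enough to dodge for a month, but not few enough to dodge forever.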
Until, of course, we got unlucky. Our backup was several days old, and we had unloaded some birthday party pictures, so we really didn’t want to just scratch it. I tried DiskWarrior first, but after about 15 hours (actually, more like 25, but a power failure set me back) I decided to give Data Rescue a try. Data Rescue did not repair the disk, but it did recover the valuable data for me. My understanding of the problem is that DiskWarrior would eventually have succeeded, but it gave no feedback about its progress. Data Rescue had a “thorough” option that I tried first; however, after hitting a few bad blocks it started estimating completion times in the 12,000-minute range (i.e., 200 hours), so I tried a quick scan instead, which took about 30 minutes and found all the missing files.
Right now, I am reformatting and writing zeros, in hopes that the result will be a fully functional disk. If it weren’t for the hassle, I would have played the AppleCare card and just gotten a replacement drive.