Tuesday, November 18, 2008

Computer Backups @ Home

After trying lots of different backup strategies for my home computers, I’ve finally settled on a backup system that is fault, fire and theft tolerant. With a convenient combination of live recovery and offsite backup, I can easily recover from single hard drive failures, and even a complete system loss.

The trick is that I’m using a RAID 1 mirror approach with an extra drive. In this scenario, I’ve got two drives working as one that tolerates the complete loss of one of the drives, and the system continues to operate without interruption. Each week, I power down the systems and remove one of the drives (which is a perfect copy of the other). I then install a drive that I had kept offsite in a safe deposit box at my bank. During the boot process I tell the machine to rebuild the RAID 1 array and overwrite the newly inserted drive. The RAID rebuild doesn’t take much processing power, and a few hours later the machine is back to full redundancy.

At any one time, I have 2 drives in a machine providing fault tolerance, and a third drive at the bank acting as a standby in case of a complete and total loss. I’ve experimented with my motherboard’s built-in ICH8R hardware-assisted RAID controller, and I know that I can easily rebuild from only the offsite drive.
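To make the weekly rotation concrete, here is a minimal sketch of it as a small simulation. The drive labels A, B and C and the function names are made up for illustration, and it assumes I always pull the drive that has been in the machine longest. It only models which drive is where; it does not talk to any RAID controller.

```python
def rotate(in_machine, offsite):
    """Model one weekly swap (hypothetical helper, labels are illustrative).

    in_machine: list of the two drives currently mirrored, oldest first.
    offsite:    the drive currently sitting in the safe deposit box.
    Returns the new (in_machine, offsite) pair after the swap.
    """
    pulled, survivor = in_machine      # pull the longest-serving drive
    # The survivor is the rebuild source; the returning drive gets overwritten.
    return [survivor, offsite], pulled


def simulate(weeks):
    """Return a list of (week, drives_in_machine, drive_at_bank) tuples."""
    in_machine, offsite = ["A", "B"], "C"
    history = []
    for week in range(1, weeks + 1):
        in_machine, offsite = rotate(in_machine, offsite)
        history.append((week, tuple(in_machine), offsite))
    return history


if __name__ == "__main__":
    for week, mirrored, at_bank in simulate(6):
        print(f"week {week}: mirrored={mirrored} bank={at_bank}")
```

Under that assumption, every drive spends two weeks in the machine and one week at the bank, and the layout repeats every three weeks.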

To make the process easier, I put mobile drive racks into my 5.25" bays. The rack accepts direct insertion of SATA drives (no sleds needed), so I don’t even have to disassemble anything to make the drive swap. These days 500GB drives are cheap, which makes this scheme very affordable.

KINGWIN KF-1000-BK 3.5" Internal hot swap rack ($25)
http://www.newegg.com/Product/Product.aspx?Item=N82E16817990001


OEM 500GB drives ($60-$70)
http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=2010150014%20103530113&bop=And&Order=PRICE


This is arguably not a backup strategy, but considering that I have fault tolerance AND an offsite copy, this simple system satisfies all my needs. The offsite “backup” is up to about a week old at any one point, but I would suggest that it’s more current, more reliable, more viable and more convenient than most people’s backup strategies. Most people back up by spending hours copying files to an external drive; my backup is basically just pulling the drive out of the machine.

Also consider that my RAID 1 mirror strategy can be used like a Virtual Machine rollback feature. If I wanted to, I could remove one of the drives and try arbitrary system changes and software installs. If I didn’t like the result, I could remove the remaining drive and reinsert the one I had pulled beforehand. At that point the system would be back in its pre-experimentation state. The system would need to rebuild the array, but that is a small price to pay for having rollback capabilities on a physical machine.

There are some issues/annoyances to this approach. First, when a drive is failing, typical consumer drives make “heroic” attempts to prevent data loss. This means the drive can spend an extremely long time retrying and remapping bad sectors (hard drives are S.M.A.R.T., you know :-). I’ve seen this occur on my systems, and it takes a while to realize why the system is semi-freezing, acting erratic, or possibly not shutting down properly. Once I recognize it, I shut down and run my vendor-supplied drive diagnostics to find (and possibly correct) the bad drive. I don’t like any drive problems, but these days the diagnostic software will correct them well enough that a warranty exchange can’t be done. The other annoyance is having to shut down my systems once a week and drive to the bank. The systems aren’t down for long and it is a short drive to the bank, but it is slightly annoying. If you wanted to attempt this scheme, you would have to RAID-enable your system in some way. Maybe by buying a new motherboard, or a PCI card, or an external NAS RAID… I can see that being annoying too if you don’t have it already.

With my frequent visits to my safe deposit box in the vault, carrying my precious data (photo collection, Word docs, personal projects, etc…), I feel like it would be cool to have one of those steel secret-agent briefcases handcuffed to my wrist :-)

Note:
I could use an online Internet cloud-based backup solution, but my Honda Civic can transfer 1.5TB of data to/from offsite in about 20 minutes. My cable modem at home can't achieve that type of bandwidth every week :-)
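For the curious, here is a back-of-the-envelope sketch of that comparison. The 1.5TB and 20 minute figures are the ones above; the 10 Mbit/s uplink speed is purely an assumed number for illustration, not a measurement of my connection, and the names are made up.

```python
TB = 10**12  # decimal terabyte, in bytes

def bandwidth_bits_per_s(data_bytes, seconds):
    """Effective throughput of moving data_bytes in the given time."""
    return data_bytes * 8 / seconds

def transfer_days(data_bytes, link_bits_per_s):
    """Days needed to push data_bytes over a link of the given speed."""
    return data_bytes * 8 / link_bits_per_s / 86_400

data = 1_500_000_000_000           # 1.5TB carried in the car
civic_bps = bandwidth_bits_per_s(data, 20 * 60)

ASSUMED_UPLINK_BPS = 10_000_000    # hypothetical 10 Mbit/s cable uplink
upload_days = transfer_days(data, ASSUMED_UPLINK_BPS)

if __name__ == "__main__":
    print(f"Civic: {civic_bps / 1e9:.1f} Gbit/s effective")
    print(f"Assumed uplink: {upload_days:.1f} days per full copy")
```

The car works out to about 10 Gbit/s of effective throughput, while the assumed uplink would need roughly two weeks per full copy, which is longer than the weekly rotation allows.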