"POP"
Not the sound you want to hear from your Drobo NAS unit! Especially after a full day's work.
"Disaster" has many levels, but if you're a professional in any line of business that involves storing masses of digital data and you don't have even a minimal disaster recovery plan, then when your luck runs out (and it will!) you're going to be in trouble!
I've seen companies forced into administration because they had no backup plan or disaster recovery plan. They just could not recover without their accounts system, their client data, their job register.
Maybe you don't care but, trust me, your clients will. Do not underestimate the amount of time and effort that's going to be required to get you back up and running if you lose your data.
[Image: Drobo Stack]
Overview ...
To understand how our plan evolved you need a quick overview of why our system is the way it is.
We're photographers and retouchers. We keep masses of our own digital images for whatever use (prints, leasing, etc.). We also keep masses of client images for retouching (they probably don't have a backup).
When we first set up we needed a common network area for shared data storage. A full-blown server was too much hassle and admin. Enter the Drobo, set up as a RAID5 array for data security and resilience. Note that RAID5 is not great for recovery: if the unit itself fails you cannot access the data on the disks. As a first line of defence within a larger plan, though, it is fine.
There wasn't a job system available with the facilities we required, so we created one based on MS Access. As former software developers it made sense (yeah, we know SQL is better but it's way overkill for our needs and we don't have a lot of "admin time"). It has check in/out facilities for moving jobs between the Drobo and a workstation. It is also split into a front-end and a back-end, with the back-end on the Drobo.
For legal compliance, all of our emails are archived, indexed and searchable from the Drobo.
We run a digital accounts system, backed up to the Drobo.
Other, trivial, stuff happens around the Drobo, but losing any of the data described above is what we would call "a problem".
OK, you get the picture: we do a lot of network-based activity that depends on the Drobo.
Our Current System ...
Our primary NAS, the Drobo, is our day-to-day working NAS. When we need to work on a customer job we check out the job, via the job system, to our local workstation, work on it, then check it back in (in a nutshell).
Daily, our workstations back up personal files and other workstation-based files (accounts, etc.) to the Drobo. Incremental backups Mon-Thu, full backup Fri.
Weekly, the website is downloaded to the Drobo.
Daily, the Drobo is mirrored to a secondary NAS unit set up as a RAID1 array.
Weekly, the secondary NAS is backed up to removable HDDs.
Monthly, the removable HDDs are stored off site.
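As a rough illustration, the schedule above could be written down as a small lookup that answers "what should run today?". This is only a sketch, not how our system is actually driven. The post does not say which day the weekly jobs run, when in the month the drives go off site, or what happens at weekends, so the Friday choice, the "first Friday of the month" rule and the "no workstation backup at weekends" rule are all assumptions, and the function names are made up for the example.

```python
from datetime import date
from typing import List, Optional

# ASSUMPTION: weekly jobs run on Friday (Monday is 0, so Friday is 4).
WEEKLY_WEEKDAY = 4


def workstation_backup_kind(day: date) -> Optional[str]:
    """Incremental Mon-Thu, full on Fri.

    ASSUMPTION: no workstation backup at the weekend.
    """
    weekday = day.weekday()
    if weekday <= 3:
        return "incremental"
    if weekday == 4:
        return "full"
    return None


def tasks_for(day: date) -> List[str]:
    """List the backup tasks due on a given day, in the order they run."""
    tasks = []
    kind = workstation_backup_kind(day)
    if kind:
        tasks.append(f"workstations -> Drobo ({kind})")
    # The mirror runs every day.
    tasks.append("Drobo -> secondary NAS (mirror)")
    if day.weekday() == WEEKLY_WEEKDAY:
        tasks.append("website -> Drobo (download)")
        tasks.append("secondary NAS -> removable HDDs")
        # ASSUMPTION: the monthly off-site run is the first Friday of the month.
        if day.day <= 7:
            tasks.append("removable HDDs -> off site")
    return tasks
```

Calling `tasks_for(date.today())` gives a simple checklist for the day. Writing the schedule down like this makes it easy to spot a tier that has quietly stopped running.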
This may seem heavy-handed and over-engineered but, believe me, at one point or another each part of this setup has had to be pressed into use in anger. Think "multiple redundancy": nothing is infallible.
Disaster Recovery ...
Our basic strategy ...
1) Any workstation fails - the work and necessary files are on the Drobo to be accessed from another machine to allow business to continue until the failed machine is replaced/fixed.
2) A Drobo drive fails - because of RAID5, the unit will rebuild and maintain data access until the failed drive has been replaced.
3) Drobo unit fails - the work and business critical data are mirrored to the secondary NAS unit. Remove the Drobo, set up the secondary as primary and carry on until the Drobo is replaced/fixed.
4) Secondary NAS unit fails - replace it quickly, not an immediate problem but needs to be addressed.
5) Drobo & Secondary fail - because of the RAID1 setup on the secondary there are two copies of the data. Remove either drive, put it into a workstation and set up that workstation as the primary network data source until the system is repaired.
6) Drobo fail, secondary fail, single secondary drive fail - there is a mirrored copy of the data on the other drive that is part of the RAID1 pair.
7) Drobo full fail, secondary full fail - there is an offsite copy.
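The fallback order in points 3 and 5 to 7 can be sketched as a simple decision function. This is an illustration only: the `Status` fields and the `data_source` name are invented for the example, and points 1, 2 and 4 are left out because they do not change where the data is served from.

```python
from dataclasses import dataclass


@dataclass
class Status:
    """Hypothetical health snapshot of the storage tiers."""
    drobo_ok: bool = True
    secondary_unit_ok: bool = True
    secondary_drives_ok: int = 2  # healthy drives in the RAID1 pair


def data_source(s: Status) -> str:
    """Pick where the business should read its data from right now."""
    if s.drobo_ok:
        return "Drobo"
    # Point 3: the Drobo is down, the mirror takes over.
    if s.secondary_unit_ok and s.secondary_drives_ok >= 1:
        return "secondary NAS promoted to primary"
    # Points 5 and 6: both units are down, but at least one mirrored drive survives.
    if s.secondary_drives_ok >= 1:
        return "RAID1 drive moved into a workstation"
    # Point 7: nothing on site is usable.
    return "off-site copy"
```

For example, `data_source(Status(drobo_ok=False))` returns `"secondary NAS promoted to primary"`. The point of spelling it out is that every combination of failures still ends in an answer.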
We recently suffered a Drobo power supply failure and point (3) saved us!
We're not experts at disaster recovery, nor is it our primary concern, but previous careers taught us how critical it is in a business environment.
There are other ways to do it, other mechanisms, other systems but this is how we do it. It works. It's been tested.
Don't wait until your systems fail to set up at least something minimal (e.g. a backup to an external drive).