Disaster recovery


Introduction

I decided to write this paper based on a couple of scary experiences I've had over my professional life. It seems very strange to me how many companies take an almost flippant approach to planning for disasters.
Consider September 11th, Hurricane Katrina or Superstorm Sandy. If such an event struck your data-center, would your company be able to recover?

Understanding the value of your engineering data

All data in a company is important, and it is critical to back it up and make sure you can recover from unplanned issues, but consider the loss of your engineering data. In many companies there is no company without this data! It still surprises me that senior people in companies don't realize that without the intellectual property (IP) of the engineering drawings, bills of materials, NC toolpaths, material specs, process sheets etc. the company could quickly go out of business.
So that is why people invest in PLM systems etc., right? Of course, but such a system is only one facet of ensuring that your company's IP is safe. There are many other areas of backup and restore that need to be in place to ensure that the company can continue to run after a catastrophic event.

Upgrades and updates of software

I include this section to remind people that disaster recovery is about more than backing up data. I have seen many companies with a significant quantity of engineering data in old systems. So here is a first rule:
Ensure that all your important data is in a “supported” system and format.
A “supported” system ensures that if there is a problem, your vendor will likely be able to help you in some way. With an unsupported legacy system whose vendor went out of business 10 years ago, your data may be locked away somewhere, never to be seen again. If you have a system that is out of support, I'd suggest migrating the data or at least exporting it so that you can archive it in some way, e.g. as zip files. Although not the best approach, it does allow some access to the data. Information stored in a proprietary database would be lost along with the vendor!
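As a rough illustration of the "export and zip" fallback, here is a minimal Python sketch. It assumes the legacy data has already been exported to ordinary files in a folder; the function name `archive_export` and both paths are hypothetical, not part of any PLM product. It zips the folder and then re-reads the archive to confirm every file inside is still readable.

```python
import zipfile
from pathlib import Path


def archive_export(export_dir: str, zip_path: str) -> int:
    """Zip every file under export_dir and return the number of files stored.

    Assumption: zip_path is located outside export_dir, so the archive
    does not try to include itself.
    """
    root = Path(export_dir)
    count = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the export folder so the archive
                # can be unpacked anywhere later.
                zf.write(path, arcname=path.relative_to(root).as_posix())
                count += 1

    # Re-open the archive and check each member's CRC. testzip() returns the
    # name of the first corrupt member, or None if everything reads cleanly.
    with zipfile.ZipFile(zip_path) as zf:
        bad = zf.testzip()
        if bad is not None:
            raise RuntimeError(f"corrupt member in archive: {bad}")
    return count
```

A zip of exported files only preserves what the export contained. Relationships, revisions and workflow history held in the old database are not captured unless the export included them.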
Note – some defense companies insist that data remains in a system for X years so in some cases moving to new software and/or releases might not be feasible.
I'm also a strong believer in keeping to a fairly recent release/version of the vendor's CAD and PLM tools – no Windows 3.1 please!

Planning - when to start

Planning disaster recovery should be a consideration early in any PLM project, not an afterthought. Engineering data poses some unique challenges, so backup and recovery are very important. Hardware for backups etc. should also be part of the initial budgetary considerations for any PLM implementation.

What to back up

Well, everything of course! But what does this mean? Do you really care about the kitchen that Fred designed in CATIA in his spare time and keeps in his home folder, or about test data that was in a sandbox?
Having said that, you may have specific QA data that is not on a production machine but is used to exercise the release-to-ERP processes, and that data could be really vital. So it is key to document what data should be stored and backed up.
Remember that it is not just the data you need to back up. If your system has customizations and configuration files (in PLM and CAD) you will need those to be saved too. Sometimes this can be difficult to assess. This is where your validation and testing can help.
Also for legacy systems it can be useful to archive the CDs for the install of the operating system and also the applications. I've found generating virtualized copies of old operating systems can be helpful too.
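One way to document what should be backed up is a simple manifest that can be checked against what a backup run actually captured. The Python sketch below is only an illustration: the `BackupItem` type, the category labels and the item names are all made up for the example, and a real manifest would come from your own inventory of data, customizations, configuration files and install media.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupItem:
    name: str       # hypothetical label for something that may need backing up
    category: str   # e.g. "data", "customization", "configuration", "install media"
    required: bool  # False for things like sandbox or personal data


def missing_items(manifest, backed_up_names):
    """Return the names of required manifest items absent from a backup run."""
    present = set(backed_up_names)
    return [item.name for item in manifest
            if item.required and item.name not in present]


# Illustrative manifest; every entry here is an assumption for the example.
manifest = [
    BackupItem("plm_vault", "data", True),
    BackupItem("qa_erp_release_data", "data", True),
    BackupItem("plm_customizations", "customization", True),
    BackupItem("cad_config_files", "configuration", True),
    BackupItem("legacy_os_install_image", "install media", True),
    BackupItem("sandbox_test_data", "data", False),
]

# Names reported by a (hypothetical) backup run.
last_run = ["plm_vault", "cad_config_files", "sandbox_test_data"]
gaps = missing_items(manifest, last_run)
```

With the sample values above, `gaps` lists the QA data, the PLM customizations and the legacy install image, which are exactly the kinds of items that are easy to overlook.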

Archiving

Remember that some data could be archived. If you archive data from your production system, e.g. an obsolete product, you need to ensure that it can be recovered. You may need to get the data back at some stage, e.g. for legal issues, and you should be able to view all necessary data.

Validation and testing

Once you have developed a backup strategy it is imperative to validate your backup plans and test the backup data. The test should be as real as possible. For example, a full disaster recovery test could mean running on backups, and also on backup hardware, for a day to ensure that all processes and workflows are exercised.
I have seen some customers who backed up Oracle incorrectly so the backups they had were useless until we changed the way they did their cold backups!
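A small part of such a test can be automated. The sketch below, in Python, assumes you have restored a backup to a scratch folder and still have the original files to compare against; the function names and folder layout are hypothetical. It compares SHA-256 checksums file by file and reports anything missing or different.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored counterpart.

    Returns a dict with two lists of relative paths:
    "missing" (not present in the restore) and "different" (checksum mismatch).
    """
    src = Path(source_dir)
    dst = Path(restored_dir)
    problems = {"missing": [], "different": []}
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(src)
        candidate = dst / rel
        if not candidate.is_file():
            problems["missing"].append(rel.as_posix())
        elif sha256_of(path) != sha256_of(candidate):
            problems["different"].append(rel.as_posix())
    return problems
```

A check like this only covers plain files. It says nothing about whether a database backup is consistent, and it is no substitute for actually running your processes and workflows on the restored system.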

Backup versus Disaster Recovery

Some quick definitions here.
Backups are for recovering data that was lost for any of a number of reasons, e.g. a hard disk crash or someone inadvertently deleting something that then needs to be recovered.
Disaster recovery covers events such as a fire in the data-center, an earthquake that destroys the building or some kind of terrorist action. In this case you need to have plans not only for data backups, but also for backup hardware and possibly even a backup building (this might be excessive of course!).

What to do with backups?

OK, so you have a nice cloud or tape system to back everything up, and even backup hardware. But if the building is destroyed, any backups kept in it will be destroyed too. You can buy fire-proof safes, and there are off-site storage options such as Iron Mountain. If you use a cloud solution, you should ensure you know how the provider takes its backups off-site. Due diligence is always better than just assuming that they know what they are doing!

Security concerns

Finally, ensure that backups and DR plans are secure. You want the backups to be accessible, but only to those who need them, not your competitors!

So in conclusion

1. Develop backup and disaster plans.
2. Make sure you consider all facets of the intellectual property that make up your product.
3. Ensure you can access legacy data.
4. Keep applications up to date!
5. Test backup and disaster plans regularly and update as needed.
6. Make tests as real as possible – don't burn down the building as a test – PLEASE...
7. If all else fails you could send your crashed disk to a data recovery company – I've had a customer who had to do that since they had no backups! It is not pretty and very expensive!
