Disaster Recovery: The 5 Things That Often Go Wrong
When disaster happens, it’s best to be prepared. Appropriate planning, investing and testing can help minimize downtime.
There are plenty of well-worn clichés you can apply to the subject of disaster recovery: "Failing to plan is planning to fail," is one and "the devil is in the detail" is another.
Why do these sayings come up so often? Because it can be difficult to engage all the relevant stakeholders in disaster recovery planning. Essentially, DR planning is an insurance policy, and on a day-to-day basis people tend to find more pressing issues to concentrate on. This is a shame, as failing to bounce back from a major disaster can be a death sentence for a company.
In this article we look at the five things that tend to go wrong with the planning and execution of disaster recovery.
Failing to Plan
In an age of terror attacks and frequent natural disasters, we can only hope that most companies have, at the very least, the bare bones of a disaster recovery plan in place. Even then, the bare bones alone do not make a successful, workable plan.
This is where "the devil is in the detail" comes in, as DR plans need very careful thought. It’s no good having a plan that’s only 95 percent successful on execution, especially if the failed 5 percent involves email bouncing off a melted server and customer phone lines that fail to redirect.
Failing to Invest
This point brings us back to the "insurance policy" nature of disaster recovery planning. Companies obviously don’t like the thought of buying stacks of redundant equipment that will only get used in the event of a disaster, but effective business continuity doesn’t come for free.
The key is to be smart with purchasing decisions. Could old hardware be re-used? Is virtualization an option? Is it worth moving to image-based online backup because of the DR benefits this brings?
Putting an effective DR plan in place needn’t cost the earth, but nobody should expect it to cost nothing either.
Failing to Test
This is the most important point of all: No company should assume that its DR plan will work until the plan has been tested.
There’s no getting away from the fact that DR tests are expensive and time-consuming. This is probably why a frightening number of companies never get around to completing them, or do them in a half-hearted way.
This is not a good strategy. Only a full simulation will expose the shortcomings in the plan. What’s the point of only performing a test restore of a single server when another server hosts, for example, a database that’s crucial to the running of the company?
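As a rough illustration of the point above, the sketch below checks whether a planned restore test covers everything the tested servers depend on. The server names and the dependency map are hypothetical, and the function name untested_dependencies is an assumption made for this example; a real inventory would come from your own documentation.

```python
def untested_dependencies(depends_on: dict[str, set[str]], tested: set[str]) -> set[str]:
    """Return every server that a tested server relies on, directly or
    indirectly, but that is not itself part of the restore test."""
    missing: set[str] = set()
    seen: set[str] = set()
    stack = list(tested)
    while stack:
        server = stack.pop()
        if server in seen:
            continue  # already visited; also guards against circular dependencies
        seen.add(server)
        for dep in depends_on.get(server, set()):
            if dep not in tested:
                missing.add(dep)
            stack.append(dep)  # follow the chain to indirect dependencies
    return missing


# Hypothetical inventory: the web server needs the database server,
# which in turn needs a storage server.
depends_on = {
    "web-01": {"db-01"},
    "db-01": {"storage-01"},
}

# A test that restores only the web server leaves two servers unverified.
gaps = untested_dependencies(depends_on, tested={"web-01"})
print(sorted(gaps))
```

An empty result does not prove the plan works, but a non-empty one is a cheap early warning that the test is narrower than the systems the business actually relies on.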
Performing a full simulation test is a big job, so make time for it, accept the cost and involve staff in the process. This then provides the added benefit of engaging the staff team in the importance of DR. Best of all, once you know the plan works, everyone will be able to rest easier at night.
No Hardware to Recover To
Hardware is expensive. We all know that. But what’s the point of preparing for disaster recovery if you don’t have a server to restore your data to?
Another aspect to think about is hardware failure. Sure, your server might be taken down by a virus, but what happens if the motherboard fails? A complete backup of the OS and data won’t help if there is no working hardware to restore it to.
Buy bare-metal servers in multiples of the same model. That way, you not only have an identical server to restore your data to, you also have a supply of spare parts in case something fails on your main server.
Offsite Backups Really Are Offsite
The idea behind offsite backups is simple: If a major disaster hits your servers, such as a fire or flood, you have a full backup at a separate location. Barring a global nuclear attack, this is probably sufficient for most companies. However, if your offsite backup is miles away, do you know exactly how long it will take to retrieve your physical media? What if disaster strikes during rush hour? Online backups can help mitigate this, but have you considered how your network bandwidth will affect restore times?
Do some test runs to your data site and see how long it takes for a typical recovery session. If downtime is too long, consider moving your backups to a site that’s closer to the servers.
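Before running a real test, a back-of-the-envelope estimate can show whether an online restore is even plausible within your downtime target. The sketch below is only that: the function name restore_hours, the example figures and the default efficiency of 0.7 (the assumed share of nominal link speed you actually achieve) are all assumptions for illustration, not measured values.

```python
def restore_hours(backup_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to download a backup over a network link.

    backup_gb  -- backup size in gigabytes (decimal, 10**9 bytes)
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed fraction of the nominal speed actually achieved
    """
    if backup_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    total_bits = backup_gb * 8 * 10**9
    bits_per_second = link_mbps * 10**6 * efficiency
    return total_bits / bits_per_second / 3600


# Hypothetical example: a 2,000 GB backup over a 100 Mbps link.
hours = restore_hours(2000, 100)
print(f"Estimated restore time: {hours:.1f} hours")
```

With these assumed figures the download alone takes more than two and a half days, which is the kind of result that makes a courier run with physical media worth timing as well. A measured test restore should always replace the estimate once you have one.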
Plan, Invest, Test
No one wants to have to recover from a disaster, but when it happens, it’s best to be prepared. Planning, investing and testing are necessary if you want to minimize downtime.