Probabilities and Disaster Recovery

The risk of an event is sometimes inversely proportional to its impact.

When I talk about disaster recovery, one of the key things I try to stress is that the effort and resources you devote to the problem should be scaled to the risk of loss. It doesn’t help the company if you spend $1 million to ensure extremely high availability and zero data loss for a system that generates $20k in revenue a year. It might not be worth spending $100k to protect a system from more than a day of downtime if your daily revenue is less than $5,000, but it might be. You have to decide this in your environment.
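To make that trade-off concrete, here is a rough sketch in Python using the figures from this editorial. The function name and the simple break-even comparison are my own illustration, not a formal risk model, and it ignores soft costs such as reputation or contractual penalties.

```python
def break_even_periods(protection_cost, revenue_per_period):
    """How many periods of revenue it takes to equal the protection spend.

    Hypothetical helper: a crude yardstick, not a full cost model.
    """
    return protection_cost / revenue_per_period


# Spending on HA and zero data loss versus $20k of revenue a year.
years = break_even_periods(1_000_000, 20_000)

# $100k of protection versus $5,000 of revenue a day (the upper bound in the text).
days = break_even_periods(100_000, 5_000)

print(f"Full protection costs {years:.0f} years of that system's revenue")
print(f"The $100k option costs {days:.0f} days of revenue")
```

If the second number is larger than the downtime you could realistically expect over the life of the system, the spend is hard to justify on revenue alone, which is why the answer depends on your environment.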

However, when we talk about disaster recovery, it seems that the vast majority of people plan for a data center failure, a hurricane, or some other major disaster. Those are possibilities, but it’s rare that a major disaster will hit any particular part of the world, so the risk of one happening to you is low. With a low risk, it might not be worth spending a lot of money on extra hardware to handle a situation that may never occur. Most of the managers I’ve worked for haven’t felt it was worth spending a lot of money to prepare for a major disaster. There are systems that are worth duplicating to ensure high availability, and in many cases management is willing to pay for spare systems when downtime is an issue.

The most common disaster that I hear about is the “whoops” disaster: human error, a situation where someone makes a mistake in data entry. The most common “whoops” for DBAs seems to be the UPDATE or DELETE without a WHERE clause, but these days there’s no shortage of issues caused by applications that allow users to manipulate large batches of information. Yet too often I don’t see management making preparation for this type of disaster a priority.

There are numerous ways to handle these types of disasters. You can set up log shipping on a delay, even to a secure workstation (remember to secure production data), to give you time to respond to a situation and recover data. There are numerous tools, such as Red Gate’s Virtual Restore and SQL Backup Pro, that allow you to mount a backup file or recover a single object from a backup file without impacting the full database. There are log reader tools that allow you to recover data or undo transactions from the transaction log whenever issues arise. Many of these tools have a price, but the cost of downtime is usually higher. Even the cost of lost work, when highly paid professionals spend hours instead of minutes recovering data, would justify purchasing one of these tools rather than spending the time to build your own data recovery solution.
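The value of delayed log shipping comes down to one question: did you notice the mistake before the delayed copy applied it? A minimal sketch of that check in Python follows. The function name and the timestamps are hypothetical, and it simplifies matters by treating the delay as a single fixed window rather than modeling individual log backup files.

```python
from datetime import datetime, timedelta


def secondary_still_clean(mistake_at, detected_at, restore_delay):
    """True if the delayed copy has not yet applied the bad change.

    Simplifying assumption: changes reach the secondary exactly
    `restore_delay` after they happen on the primary.
    """
    return detected_at - mistake_at < restore_delay


delay = timedelta(hours=4)                  # hypothetical restore delay
mistake = datetime(2024, 1, 15, 9, 0)       # the DELETE without a WHERE

caught_quickly = secondary_still_clean(mistake, datetime(2024, 1, 15, 10, 30), delay)
caught_late = secondary_still_clean(mistake, datetime(2024, 1, 15, 14, 0), delay)

print(caught_quickly)  # True: 1.5 hours elapsed, inside the 4 hour window
print(caught_late)     # False: 5 hours elapsed, the bad change has been applied
```

The longer the delay, the more time you have to react, but the further behind the copy will be if you ever need it for something else.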

Whether you build or buy a tool to help you deal with a “whoops” disaster, you ought to make some preparations here. The risk of losing revenue from this type of disaster is much higher than the risk from the catastrophic disaster that you might spend weeks building a recovery plan for.
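One way to see why is to compare expected losses: how often something happens, multiplied by what it costs each time. The sketch below is in Python, and every frequency and duration in it is a number I made up for illustration. Substitute your own estimates.

```python
def expected_annual_loss(events_per_year, days_down_per_event, daily_revenue):
    """Expected yearly revenue lost to one type of incident."""
    return events_per_year * days_down_per_event * daily_revenue


DAILY_REVENUE = 5_000  # figure borrowed from earlier in this editorial

# Assumed: a catastrophic event once a century, costing five days of downtime.
catastrophe_loss = expected_annual_loss(1 / 100, 5, DAILY_REVENUE)

# Assumed: three "whoops" incidents a year, each costing half a day.
whoops_loss = expected_annual_loss(3, 0.5, DAILY_REVENUE)

print(f"Catastrophe: ${catastrophe_loss:,.0f} a year")
print(f"Whoops:      ${whoops_loss:,.0f} a year")
```

With these assumed numbers the small, frequent mistakes cost far more per year than the rare catastrophe, even though each one is much less dramatic.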

About way0utwest

Editor, SQLServerCentral