The recovery is the best part of a disaster

07/08/2019

by Kyle Berry

I know, I know. An article on disaster recovery (DR) and how important it is. *yawn* The necessity of DR is a foregone conclusion. I thought that, too. However, I am amazed at how often I still see companies without a defined disaster recovery plan, much less the technical readiness to carry one out.

The reasons companies give for not establishing disaster recovery capabilities run the standard gamut of excuses, including but not limited to:

Cost constraints
Resource constraints
Low probability of risk

Often, the reason is that companies have grown rapidly, and that same entrepreneurial spirit that keeps the business moving forward does not want to take on a project that focuses on risk reduction. The drive is to focus instead on projects that improve operational excellence and business differentiation.

But what happens to that push to business excellence if disaster indeed strikes? What is the cost per hour if manufacturing stops, if customer orders cannot be received, if field activities halt, if customers cannot sign up for services? Disasters may have a low probability, but they can have catastrophic impacts.
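To put rough numbers on those questions, here is a minimal back-of-the-envelope sketch. Every figure in it (the cost per hour, the outage length, and the yearly probability) is a hypothetical placeholder, not data from any real company; swap in your own estimates.

```python
def downtime_cost(cost_per_hour: float, outage_hours: float) -> float:
    """Direct cost of one outage: hourly business impact times hours down."""
    return cost_per_hour * outage_hours


def expected_annual_loss(cost_per_hour: float, outage_hours: float,
                         annual_probability: float) -> float:
    """Probability-weighted yearly cost of that outage scenario."""
    return downtime_cost(cost_per_hour, outage_hours) * annual_probability


# Hypothetical inputs: replace with estimates from your own business.
COST_PER_HOUR = 50_000      # lost orders, idle staff, stalled field work
OUTAGE_HOURS = 72           # three days to rebuild without a DR plan
ANNUAL_PROBABILITY = 0.02   # "low probability" scenario

single_outage = downtime_cost(COST_PER_HOUR, OUTAGE_HOURS)
yearly_expected = expected_annual_loss(COST_PER_HOUR, OUTAGE_HOURS, ANNUAL_PROBABILITY)

print(f"Cost of one outage:   ${single_outage:,.0f}")
print(f"Expected annual loss: ${yearly_expected:,.0f}")
```

The expected annual loss can look small next to a DR budget, which is exactly why the low-probability excuse is tempting. The single-outage figure is the one to weigh against it, because that is the bill that arrives if the disaster actually happens.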

Disasters

Now, what constitutes a “disaster”?

We usually think of the data center engulfed by fire or wiped out by a tornado. In reality, you may lose connectivity to your data and applications for a significant time because someone simply hit a nearby power pole. I’ve seen multiple companies lately with critical applications housed in office server rooms (aka closets) that depend on supplemental cooling. No kidding, one had a window unit AC! If the cooling unit fails, the equipment in that room will overheat and shut down. Other issues may be system patches that corrupt databases or ransomware that locks file systems. These scenarios are scary, but they can often be overcome with a valid DR plan in place.

Create your plan

So, where do you start?

Tier your applications according to business criticality, and document who does what and in what sequence.
Plan what hardware you need and where you will put it if your plan is to purchase new hardware and restore from tape (yes, purchase and replace is a valid plan if those applications can be offline for an extended time).
Determine your Recovery Time Objective (RTO), meaning how long you can afford each application to be down, and your Recovery Point Objective (RPO), meaning how much data you can afford to lose.
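One way to capture the tiering, ownership, and RTO/RPO decisions above is a small inventory that can be sorted into a recovery sequence. The sketch below is only an illustration: the application names, owners, tiers, and targets are all made up, and the rule of ordering by tier and then by shortest RTO is an assumption rather than a prescribed method.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class AppRecoveryTarget:
    name: str
    tier: int        # 1 = most business-critical
    rto: timedelta   # how long the application can be down
    rpo: timedelta   # how much data (measured in time) can be lost
    owner: str       # who is responsible for restoring it


def recovery_sequence(apps):
    """Order apps by tier, then by tightest RTO within a tier."""
    return sorted(apps, key=lambda app: (app.tier, app.rto))


# Hypothetical inventory for illustration only.
APPS = [
    AppRecoveryTarget("intranet-wiki", 3, timedelta(days=5), timedelta(hours=24), "it-ops"),
    AppRecoveryTarget("order-entry", 1, timedelta(hours=4), timedelta(minutes=15), "erp-team"),
    AppRecoveryTarget("field-dispatch", 1, timedelta(hours=8), timedelta(hours=1), "field-systems"),
    AppRecoveryTarget("reporting", 2, timedelta(hours=48), timedelta(hours=24), "bi-team"),
]

for step, app in enumerate(recovery_sequence(APPS), start=1):
    print(f"{step}. {app.name}: tier {app.tier}, RTO {app.rto}, RPO {app.rpo}, owner {app.owner}")
```

An application with a multi-day RTO, like the wiki here, is the kind of candidate for the purchase-and-restore-from-tape approach, while the tier 1 entries are the ones that need replication.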

With many companies taking advantage of Cloud technology, DR planning has become much more streamlined. However, you must be intentional about your disaster recovery planning – even if you utilize the Cloud. Just because you are in the Cloud does not make you bulletproof, which can be a hard lesson for folks to learn. Even if you migrate to the Cloud, you still need to replicate data and virtual machines, which is typically an additional cost. However, most Cloud providers can configure and manage that replication, and they will often have SLAs that guarantee your uptime.

See if it works

Once you have your plan in place, be sure to test it.

Coordinate with your remote data centers, cloud providers, and most importantly, your business stakeholders.
Validate that data replication is working and that recent production data is present in the DR environment in accordance with your RPO.
Make sure that all applications and application interfaces are working properly. (Hint: This is where you find hidden hard-coded IP addresses for interfaces. Yikes!)
Shut down connectivity to primary instances and fire up the DR environments.
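Two of the checks above lend themselves to a quick script: confirming that the DR copy is fresh enough to meet your RPO, and hunting for hard-coded IP addresses in interface configuration. The sketch below assumes you can already obtain the timestamp of the last replicated record and the text of a config file; how you get those depends on your replication tool and is not shown. The function names and sample values are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone


def replication_within_rpo(last_replicated_at, rpo, now=None):
    """True if the newest data in DR is no older than the RPO allows."""
    now = now or datetime.now(timezone.utc)
    return (now - last_replicated_at) <= rpo


# Matches dotted IPv4 literals with each octet in the range 0-255.
# It can also flag four-part version strings, so review hits by hand.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4_PATTERN = re.compile(rf"\b(?:{_OCTET}\.){{3}}{_OCTET}\b")


def find_hardcoded_ips(config_text):
    """Return (line_number, ip) pairs for every IPv4 literal found."""
    hits = []
    for line_number, line in enumerate(config_text.splitlines(), start=1):
        for match in IPV4_PATTERN.finditer(line):
            hits.append((line_number, match.group()))
    return hits


# Hypothetical values for illustration only.
checked_at = datetime(2019, 7, 8, 12, 0, tzinfo=timezone.utc)
last_copy = checked_at - timedelta(minutes=10)
print(replication_within_rpo(last_copy, timedelta(minutes=15), now=checked_at))

sample_config = """\
erp.host = erp.example.internal
billing.endpoint = http://10.20.30.40:8080/api
timeout_seconds = 30
"""
print(find_hardcoded_ips(sample_config))
```

A script like this does not replace the failover test itself. It only catches two common surprises before you cut connectivity to the primary instances.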

I get it – it’s a lot of work to prepare for disaster recovery, and it can carry a sizeable cost. Once you factor in the cost of downtime and lost data, however, the cost and time to set up DR are justifiable. Addressing that foregone conclusion will ultimately help you sleep better at night.

About the author

Kyle Berry

With more than 30 years of management consulting and IT operations experience, Kyle leads strategic efforts that help clients leverage …
