Recent natural disasters such as Hurricane Matthew have raised awareness of disaster risk and increased internal pressure to validate the effectiveness of the Disaster Recovery Plan. Industry and government regulations, as well as customers, increasingly require organizations to provide evidence of recoverability before they are given the right to do business.
As with any other major project, planning is critical for your Disaster Recovery project. Defining a comprehensive set of requirements at the start of the project helps ensure that you implement the best-fit Disaster Recovery solution. Disaster Recovery solutions are long-term commitments, and mistakes in planning will be amplified during implementation.
In order to determine the best-fit Disaster Recovery solution, we recommend having a roadmap in place. It helps you analyze the pros and cons of Disaster Recovery options and identify the solution that balances your budget and Disaster Recovery requirements. It also helps you avoid common pitfalls by integrating key insights from a variety of organizations that have implemented a Disaster Recovery solution.
Disaster Recovery Project Roadmap
The roadmap enables you to determine the best-fit Disaster Recovery solution from the use of in-house Disaster Recovery sites to managed services (e.g. Disaster Recovery service providers) to Cloud-based Disaster Recovery. The roadmap has three different phases: 1) Conduct Requirements Gathering; 2) Evaluate potential Disaster Recovery (DR) solutions; and 3) Determine the best-fit DR site solution.
Phase 1. Conduct Requirements Gathering
1. Storyboard – Review guidelines for using the tools and templates described here to build your DR Site Strategy.
2. Business Impact Analysis – Prioritize all the systems/applications of your organization into tiers based on their business impact.
3. Risk Catalogue – Discuss potential risks that can affect your primary site and create a Risk catalogue which ranks the potency of each risk.
4. Current State SWOT Analysis – Discuss the current state of your infrastructure. Gather requirements for the ideal DR solution.
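As an illustration of the Risk Catalogue step, the ranking could be sketched as follows. This is a minimal sketch, not a prescribed method: it assumes that the "potency" of a risk can be approximated as likelihood multiplied by impact, each on a 1 to 5 scale. The `Risk` class, the scales, and the sample entries and scores are all hypothetical placeholders to be replaced with your own assessment.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)
    impact: int      # hypothetical scale: 1 (minor) to 5 (loss of primary site)

    @property
    def potency(self) -> int:
        # Assumption: potency is approximated as likelihood x impact.
        return self.likelihood * self.impact


# Placeholder entries; the scores are illustrative, not real assessments.
catalogue = [
    Risk("Hurricane", likelihood=2, impact=5),
    Risk("Power outage", likelihood=4, impact=3),
    Risk("Hardware failure", likelihood=3, impact=2),
]

# Rank the catalogue from most to least potent.
ranked_risks = sorted(catalogue, key=lambda r: r.potency, reverse=True)

for risk in ranked_risks:
    print(f"{risk.name}: potency {risk.potency}")
```

A simple product keeps the ranking easy to explain to stakeholders. If your organization already uses a different risk-scoring scheme, substitute it in the `potency` property.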
Phase 2. Evaluate potential DR solutions
5. Evaluate Options:
a. In-House DR Site – Evaluate the implication of leveraging an in-house DR site.
b. Managed Services Provider (MSP) – Evaluate the implication of leveraging an MSP-based DR solution.
c. IaaS – Evaluate the implication of leveraging an IaaS-based DR solution.
6. Cost of Ownership Analysis – Conduct a thorough cost analysis to estimate the potential cost of each potential DR solution.
7. Force Field Analysis – Analyze each DR solution based on its merits and weaknesses. Incorporate the perspectives of a variety of stakeholders.
8. Deployment Model Decision Matrix – Based on the Force Field Analysis, rank and shortlist the most viable DR solutions.
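One common way to build a decision matrix like the one in step 8 is weighted scoring. The sketch below uses that technique as an assumption; the roadmap does not specify how the matrix is calculated. The criteria, weights, and 1 to 5 scores are hypothetical placeholders and do not represent an assessment of any option.

```python
# Hypothetical criteria and integer weights (higher = more important).
weights = {"cost": 4, "rto_fit": 3, "rpo_fit": 2, "staff_effort": 1}

# Placeholder scores from 1 (poor) to 5 (excellent) for each option.
scores = {
    "In-House DR Site": {"cost": 2, "rto_fit": 5, "rpo_fit": 5, "staff_effort": 2},
    "MSP": {"cost": 3, "rto_fit": 4, "rpo_fit": 4, "staff_effort": 4},
    "IaaS": {"cost": 4, "rto_fit": 3, "rpo_fit": 3, "staff_effort": 3},
}


def weighted_score(option_scores: dict, criteria_weights: dict) -> int:
    """Sum of weight x score over all criteria."""
    return sum(criteria_weights[c] * option_scores[c] for c in criteria_weights)


# Rank the options from highest to lowest weighted score.
ranked_options = sorted(
    scores, key=lambda name: weighted_score(scores[name], weights), reverse=True
)

for name in ranked_options:
    print(f"{name}: {weighted_score(scores[name], weights)}")
```

The output is only as good as the inputs, so the weights should reflect the stakeholder perspectives gathered during the Force Field Analysis.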
Phase 3. Determine the best-fit DR site solution
9. Case Studies – Learn how other organizations are creating their DR environments. Leverage best practices and avoid pitfalls.
10. Document the DR Solution – Based on previous analysis, document the DR solution. Ensure clarity and ease of access for internal and external stakeholders.
11. Executive Communication Tool – Communicate to the executive team the necessary investments and benefits of the ideal DR site.
What do Recovery Point Objective (RPO) and Recovery Time Objective (RTO) mean?
As you go through the phases of a Disaster Recovery project roadmap, you will encounter terms such as Recovery Time Objective (RTO) and Recovery Point Objective (RPO), the two measures typically used to express disaster recovery timeline objectives. In case you are not familiar with these terms, here are their definitions.
- Recovery Point: Refers to how much data would be lost if you had to revert to your backup data (i.e. how old is your most recent backup?).
- Recovery Point Objective (RPO): Refers to the organization’s maximum tolerance for data loss. An RPO of 4 hours means you are committing to ensuring that your most recent backup data is never more than 4 hours old – i.e. running a backup every 4 hours or less.
- Recovery Time: Refers to how long it takes to restore IT services. If a server goes down and it takes 30 minutes of troubleshooting and another 15 minutes to fail over to a standby server, the recovery time is 45 minutes.
- Recovery Time Objective (RTO): Refers to the organization’s maximum tolerance for downtime. An RTO of 4 hours means you have the technology and processes in place to restore IT services within 4 hours. The term “Maximum Tolerable Outage (MTO)” is also used to refer to the same general concept.
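The relationship between these four terms can be checked with simple arithmetic. The following is a minimal sketch that reuses the figures from the definitions above (a 4-hour RPO and RTO, and a 45-minute recovery). The timestamps and the helper names `recovery_point` and `meets_objective` are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta


def recovery_point(last_backup: datetime, failure_time: datetime) -> timedelta:
    """Age of the most recent backup at the moment of failure (data at risk)."""
    return failure_time - last_backup


def meets_objective(actual: timedelta, objective: timedelta) -> bool:
    """True if the measured value is within the stated objective."""
    return actual <= objective


rpo = timedelta(hours=4)  # maximum tolerated data loss
rto = timedelta(hours=4)  # maximum tolerated downtime

# Hypothetical timestamps: last backup at 08:00, failure at 11:30.
last_backup = datetime(2024, 1, 1, 8, 0)
failure_time = datetime(2024, 1, 1, 11, 30)
rp = recovery_point(last_backup, failure_time)

# 30 minutes of troubleshooting plus 15 minutes to fail over.
recovery_time = timedelta(minutes=30) + timedelta(minutes=15)

print(f"Recovery point: {rp}, within RPO: {meets_objective(rp, rpo)}")
print(f"Recovery time: {recovery_time}, within RTO: {meets_objective(recovery_time, rto)}")
```

Here the recovery point (3 hours 30 minutes) and the recovery time (45 minutes) are what actually happened, while the RPO and RTO are the targets they are measured against.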
What Disaster Recovery options do I have?
The list of Disaster Recovery options below is designed to help you define your requirements and then select a solution that best meets your requirements.
- In-House Site: This is the traditional approach of setting up a secondary site that mirrors the primary site (at least for critical applications).
- Mutual Aid Agreements / Reciprocal DR Sites:
  - If you have two or more primary datacentres (e.g. for two separate divisions), each location can function as a DR site for the other, depending on available capacity. This saves the cost of building out a dedicated DR site.
  - Similarly, two or more separate organizations (usually related – e.g. two colleges) can enter into the same type of reciprocal DR agreement.
- Commodity Cloud-Based Solutions (i.e. using IaaS): Depending on your workloads and requirements, some or all of your primary environment can be recovered to a commodity Cloud-based environment such as AWS or Microsoft Azure.
- Hybrid managed services provider (MSP)/IaaS Solutions: Leverage an MSP that not only provides space and infrastructure for your physical standby equipment, but also offers IaaS. This can reduce costs by limiting the amount of hardware that needs to be deployed at the DR site.
- DR Service Providers: These vendors provide a DR site/environment, including hardware, along with DR support services. Depending on the provider, they may also include options to leverage IaaS as part of your solution.
- Pure Cloud-Based DR / Disaster Recovery as a Service (DRaaS): This ranges from configuring a DR environment in a public Cloud using standard offerings to leveraging vendors who offer managed DR services using an IaaS model.
- Combined DR Solution: A combination of more than one solution may be required depending on your needs.
Let Us Help!
The cost of downtime is rising across the board, and going without a Disaster Recovery Plan is no longer an option. ProServeIT offers a fully-managed, comprehensive DRaaS program. Contact us for a complimentary consultation to get started! Our team of experts will be happy to work with you to improve the resilience of your organization with the best-fit DR solution!