Disaster Recovery in the Cloud – 6 Key Challenges to Building a Data Replication System

Why Cloud for Disaster Recovery? 

Disaster recovery is a main cloud consideration for many enterprises and businesses that seek an affordable, manageable solution for protecting their data from loss. The ability to replicate business data to a secondary off-site location, without building and managing an additional data center and its infrastructure, saves considerable cost and management effort. For smaller enterprises and businesses, the cloud makes it possible to implement a disaster recovery plan that would not otherwise have been feasible.

Six Challenges and Advantages of Replication in the Cloud

When Zadara™ Storage set out to architect an enterprise cloud storage solution, it was clear that rather than placing a traditional storage system behind (or in) a cloud, our solution needed to be architected from the ground up for the cloud. The cloud’s multi-tenant and dynamic environment presents unique challenges (as well as benefits); let’s take a look at them:


1. Latency = Asynchronicity

SAN and NAS replication is a mainstay of enterprise storage backup and disaster recovery plans across industries. There are multiple on-premise solutions available, each with cost, performance, latency, and recovery point objective (RPO) advantages and disadvantages.

However, the inherent latency of replicating to a cloud location from on-premise, across regions, or across providers means that mirroring is generally achieved through asynchronous means*, in which the remote storage volume is updated at chosen intervals, as often as is practical.

This means that, unlike synchronous replication, in which both volumes are always identical but which cannot operate over long distances, asynchronous replication works over any distance. The trade-off is a temporary discrepancy between the source and destination volumes every time the source volume changes, lasting until the destination volume is next updated. An advantage of systems such as the Zadara Storage Virtual Private Storage Array (VPSA) is that their snapshot-based asynchronous replication does not impact source volume performance, thanks to very efficient snapshot algorithms.
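To make the temporary discrepancy concrete, here is a minimal Python sketch of snapshot-based asynchronous mirroring. It is a toy model, not the VPSA implementation: the `Volume` and `AsyncMirror` classes and their methods are hypothetical names invented for this illustration, and a volume is reduced to an in-memory dictionary of blocks.

```python
class Volume:
    """Toy volume: a mapping from block number to block contents."""

    def __init__(self):
        self.blocks = {}

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def snapshot(self):
        # A point-in-time copy of the contents (hypothetical helper).
        return dict(self.blocks)


class AsyncMirror:
    """Copies the source to the destination only when sync() is called."""

    def __init__(self, source, destination):
        self.source = source
        self.destination = destination

    def sync(self):
        # Ship the latest snapshot; later writes wait for the next sync.
        self.destination.blocks = self.source.snapshot()

    def in_sync(self):
        return self.source.blocks == self.destination.blocks


source, destination = Volume(), Volume()
mirror = AsyncMirror(source, destination)

source.write(0, "orders-v1")
print(mirror.in_sync())  # False: the destination lags behind the source
mirror.sync()
print(mirror.in_sync())  # True: identical until the next source write
```

The interval between calls to `sync()` plays the role of the RPO: anything written after the last sync is what would be lost if the source failed at that moment.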


2. Networking and Bandwidth

One of the main challenges of data management in a cloud system is external bandwidth, and in particular its cost and speed limit. We have identified two key elements to help our customers reach maximum efficiency and hence reduce both traffic volume and bandwidth costs:

  • Compression of data: all data shipped is compressed
  • Only differential data is shipped: the system is as thorough, precise and efficient as possible in determining what data has actually changed since the writing of each consistency point (a point-in-time snapshot of the data), and ships the differential data only.
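The two elements above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not how VPSA computes or ships its differentials: snapshots are modeled as dictionaries of block number to string contents, the function names `changed_blocks`, `pack_delta` and `apply_delta` are hypothetical, and compression uses Python's standard `zlib` module.

```python
import json
import zlib


def changed_blocks(previous_snapshot, current_snapshot):
    """Return the blocks that differ between two consistency points."""
    delta = {
        block_no: data
        for block_no, data in current_snapshot.items()
        if previous_snapshot.get(block_no) != data
    }
    deleted = [b for b in previous_snapshot if b not in current_snapshot]
    return delta, deleted


def pack_delta(delta, deleted):
    """Serialize and compress the differential before it is shipped."""
    payload = json.dumps({"changed": delta, "deleted": deleted})
    return zlib.compress(payload.encode("utf-8"))


def apply_delta(replica, packed):
    """Decompress a differential and apply it to the remote copy."""
    message = json.loads(zlib.decompress(packed).decode("utf-8"))
    for block_no, data in message["changed"].items():
        replica[int(block_no)] = data  # JSON turns integer keys into strings
    for block_no in message["deleted"]:
        replica.pop(block_no, None)
    return replica


previous = {0: "aaaa", 1: "bbbb", 2: "cccc"}
current = {0: "aaaa", 1: "BBBB", 3: "dddd"}

delta, deleted = changed_blocks(previous, current)
replica = apply_delta(dict(previous), pack_delta(delta, deleted))
print(replica == current)  # True: only blocks 1, 2 and 3 were touched
```

Block 0 never crosses the wire because it did not change between the two consistency points, which is where the traffic savings come from.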


3. Capacity = Cost

Whether on-premise or in the cloud, more capacity equates to more cost. One of the unique features of the VPSA Enterprise Suite snapshot-based replication is user control over how often snapshots are created and how many are retained for each respective volume. Users require different frequencies of snapshots and RPO for different applications, volumes, and business needs. By allowing users to choose the frequency on a per-volume basis, VPSA enables further efficiency and control, saving both capacity and cost.
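As a rough illustration of per-volume control, the Python sketch below gives each volume its own snapshot interval and retention count. The `SnapshotPolicy` and `VolumeSnapshots` names, their fields, and the minute-based clock are assumptions made for this example; they are not VPSA settings or API names.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotPolicy:
    interval_minutes: int  # how often a snapshot is taken
    keep_last: int         # how many snapshots are retained (must be >= 1)


@dataclass
class VolumeSnapshots:
    policy: SnapshotPolicy
    snapshots: list = field(default_factory=list)  # snapshot times, in minutes

    def tick(self, now_minutes):
        """Take a snapshot if one is due, then prune to the retention limit."""
        due = (
            not self.snapshots
            or now_minutes - self.snapshots[-1] >= self.policy.interval_minutes
        )
        if due:
            self.snapshots.append(now_minutes)
            self.snapshots = self.snapshots[-self.policy.keep_last:]
        return due


# A busy database volume and a rarely changing archive volume.
database = VolumeSnapshots(SnapshotPolicy(interval_minutes=1, keep_last=3))
archive = VolumeSnapshots(SnapshotPolicy(interval_minutes=60, keep_last=2))

for minute in range(0, 181):
    database.tick(minute)
    archive.tick(minute)

print(database.snapshots)  # [178, 179, 180]
print(archive.snapshots)   # [120, 180]
```

After three hours the database volume holds three very recent snapshots while the archive volume holds two hourly ones, so each volume consumes only the capacity its own RPO calls for.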


4. Consensus Across Dynamic Systems

A challenge unique to the cloud is that the cloud is a dynamic system in which IP addresses or names can frequently change. This becomes even more challenging when replicating across different systems or providers that employ heterogeneous solutions and interfaces.

The storage replication system must still be able to find its peer and make sure it is in consensus about the state of the mirroring process. For our engineers this meant developing a unique consensus algorithm that coordinates between the peers through the various states and stages, no matter where each peer is located.

This means the algorithm supports mirroring within the same cloud, across different regions of a cloud, in a hybrid cloud, or even across multiple providers (VPSA is currently available at AWS, Dimension Data and CloudSigma, as well as at dedicated colocation facilities around the world).
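The Python sketch below illustrates the general idea only; it is not Zadara's consensus algorithm, whose details are not described here. It assumes two simplifications: each peer carries a stable identifier that is independent of its network address, and each peer counts its state transitions so that, on reconnecting, both sides can adopt the more recent state. All names (`MirrorPeer`, `reconcile`, the state strings, the addresses) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MirrorPeer:
    peer_id: str         # stable identity that survives address changes
    address: str         # may change at any time in a dynamic cloud
    generation: int = 0  # incremented on every state transition
    state: str = "idle"

    def transition(self, new_state):
        self.generation += 1
        self.state = new_state


def reconcile(a, b):
    """Bring both peers to the state with the highest generation."""
    newer, older = (a, b) if a.generation >= b.generation else (b, a)
    older.generation = newer.generation
    older.state = newer.state
    return newer.state


source = MirrorPeer(peer_id="vol-1-source", address="10.0.0.5")
target = MirrorPeer(peer_id="vol-1-target", address="10.1.0.9")

source.transition("syncing")
source.address = "10.0.7.42"  # the address changes; the identity does not

print(reconcile(source, target))  # syncing
print(target.state)               # syncing
```

Because agreement is keyed on `peer_id` and `generation` rather than on the address, the peers can still converge after either side has moved.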


5. Bi-Directional Replication

Many companies seeking a cost-efficient DR solution still back up to tape or other types of media that are *assumed* to be working as a DR site, but are seldom tested or used.

An element that was critical in designing the VPSA data recovery system was to ensure that the failover site is not only functional, but that the destination site can be used as the primary site.

VPSA switchover can be made frequently and easily in order to not only serve as a DR site but to also leverage the cloud’s agile environment. Such agility is necessary if one wants to be able to react in a timely fashion to changes in regional demand, downtime, or impending severe weather.

VPSA enables bi-directional replication and allows users to link, break and re-merge volumes as frequently as the user chooses. Volumes based on the source volume can be indefinitely resynced to the original source. And no matter where you are in the world, you’re a reasonable replication distance from a Zadara data center – so that at the push of a button you can automatically replicate to many other places on the planet.
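A switchover of this kind can be pictured with the toy Python model below. It is a sketch under stated assumptions rather than VPSA behavior: the `ReplicationPair` class, its methods, and the site names are hypothetical, volumes are plain dictionaries, and replication copies the whole volume for brevity.

```python
class ReplicationPair:
    """Two sites mirroring one volume, with a reversible direction."""

    def __init__(self, site_a, site_b):
        self.volumes = {site_a: {}, site_b: {}}
        self.primary = site_a
        self.secondary = site_b

    def write(self, block_no, data):
        # Applications always write to the current primary.
        self.volumes[self.primary][block_no] = data

    def replicate(self):
        # Bring the secondary up to date with the primary.
        self.volumes[self.secondary] = dict(self.volumes[self.primary])

    def switchover(self):
        # Final sync, then reverse the replication direction.
        self.replicate()
        self.primary, self.secondary = self.secondary, self.primary


pair = ReplicationPair("on-premise", "cloud")
pair.write(0, "v1")
pair.switchover()            # the cloud site becomes the primary

pair.write(0, "v2")          # this write lands on the cloud copy
pair.replicate()             # replication now flows back to on-premise
print(pair.primary)                 # cloud
print(pair.volumes["on-premise"])   # {0: 'v2'}
```

Running the switchover regularly, rather than only in an emergency, is what turns the destination from an assumed-working backup into a tested primary site.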


6. Isolating Failures in a Multi-Tenant Environment

The cloud is a multi-tenant environment in which different users and customers share a single system. In many public cloud offerings today, this often translates to performance degradation due to ‘noisy neighbors’ as well as a fear of security breaches and failures caused by other users’ actions. By dedicating resources to each customer, Zadara Storage VPSA is able to provide not only truly private storage and predictable performance QoS, but also the assurance that any failure is isolated within the smallest possible subsystem. This approach prevents systemic or cascade failures in which the cloud (or even large portions thereof) becomes unavailable.

Zadara Storage VPSA Replication Advantages 

To summarize, some of the Zadara Storage VPSA solution’s advantages and benefits for DR in the cloud are:

  1. One-minute RPO
  2. Unlimited, low-impact snapshots
  3. Bi-directional failover
  4. Isolated, private resources
  5. DR across regions, clouds and service providers
  6. The ability to create a DR pool within a VPSA
  7. Authenticated and encrypted data transfer
  8. Custom snapshot frequency per volume
  9. Fully elastic, paid by the hour storage
  10. Unlimited, zero-capacity, instantly available, writable cloning of DR volumes using any point-in-time snapshot, e.g., for test and development.

You can learn more about the VPSA Enterprise Suite data management features on our website, or register with us to begin a free trial now.

*[We do not discuss synchronous replication because good disaster recovery practices specify a geographical separation among sites so large as to make synchronous replication impractical.]

Suggested further reading:
Asynchronous Remote Replication between AWS US-West and US-East Regions 
This short video will show you how to set up Asynchronous Remote Replication with Zadara Virtual Private Storage Arrays (VPSA), demonstrating replication between the AWS US-East and AWS US-West regions.

Microsoft© SQL© Server DR to the cloud 
Disaster recovery is a requirement for IT professionals who choose to run Microsoft SQL Server for their critical applications, but it has historically been expensive and complex. Inmage® and Zadara Storage have teamed up to make disaster recovery for MS SQL servers and applications easy and cost effective. With the public cloud as the DR site, there is no need to have a dedicated failover site with expensive hardware, software, real estate and personnel.
