When people talk about cloud storage, they usually mean storage that is accessible from outside the cloud and used to store files, pictures, and backups. Sometimes referred to as object storage, a good example of this type of cloud storage is Amazon S3. But there is a different, and more sophisticated, type of cloud storage: the storage that cloud servers need to run their applications. This storage is accessed from inside the cloud and used by cloud servers to mount their filesystems and host their databases. In AWS terminology, this is called Elastic Block Store (EBS). A MySQL database or an NTFS filesystem is a good example of a workload that uses a block storage device rather than object storage.
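To make the distinction concrete, here is a minimal sketch using Python and the boto3 AWS SDK; the bucket name, volume ID, and instance ID are hypothetical placeholders, not values from this article. Object storage is read and written through an HTTP API, while a block volume is attached to a cloud server and only then formatted and mounted like a local disk.

```python
import boto3

# Object storage: data is written and read as whole objects over an HTTP API.
s3 = boto3.client("s3")
with open("beach.jpg", "rb") as photo:
    s3.put_object(Bucket="my-backups", Key="photos/beach.jpg", Body=photo)

# Block storage: the volume is attached to a cloud server as a raw device;
# the server then formats and mounts it, and a database or filesystem sits on top.
ec2 = boto3.client("ec2")
ec2.attach_volume(VolumeId="vol-0123456789abcdef0",
                  InstanceId="i-0123456789abcdef0",
                  Device="/dev/sdf")
# On the server itself (not through the cloud API):
#   mkfs.ext4 /dev/xvdf && mount /dev/xvdf /var/lib/mysql
```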
Cloud providers are implementing different techniques and products to supply their block storage needs. Each has its advantages, but overall these techniques fall short of even the minimal requirements that enterprise-class applications impose.
The attributes that matter to enterprise customers are:
1 – Predictability of performance (does my performance vary when other customers use the cloud?)
2 – High availability (what happens if the server or storage fails – can I access my data?)
3 – Control (can I control the level of protection, caching used, or type of drives used?)
4 – Security (can people outside my organization read my data?)
5 – Dynamic expansion / elasticity (is it easy to add more storage to my cloud servers?)
6 – Portability (can I use the storage in the cloud without rewriting my applications?)
7 – Features (do I have storage features such as remote mirroring, snapshots, and thin provisioning that my storage in the datacenter has?)
Current storage products were designed to be single-tenant, that is, for a single customer using the storage. When used in a multi-tenant environment like the cloud, they run into many limitations. The same storage box that has all the right capabilities in the datacenter loses most of them when used in the cloud. Deploying SAN storage or a NAS scale-out system as the back-end storage of a public cloud is a good example. Because the box lacks multi-tenancy features, its management cannot be handed to individual users; it must remain with the cloud provider. And because the drives, CPU, and memory are shared among many customers, performance becomes inconsistent and hard to predict.
Another limitation is the lack of “shared storage” in the cloud. Any clustered application that requires shared storage, such as Oracle RAC, Linux HA, or MS Failover Cluster, cannot run in the cloud.
Finally, SAN arrays or NAS storage can support Disaster Recovery in the datacenter by providing remote mirroring capabilities. But a DR strategy requires the user to control the storage so they can run DR tests and decide how to fail over between sites. The same storage product, when used in the cloud, lacks all of these capabilities, since there is no way to hand management of the storage to every customer so they can test their own DR. Beyond that, there is no way to consistently mirror or snapshot multiple volumes belonging to a single user at the same time when they are mounted on different cloud servers.
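To illustrate that consistency gap, here is a hedged sketch using boto3; the volume IDs are placeholders. Each snapshot is an independent API call taken at a slightly different moment, so volumes attached to different cloud servers never share a single, customer-controlled point-in-time copy.

```python
import boto3

ec2 = boto3.client("ec2")

# Volumes owned by one customer but attached to different cloud servers
# (placeholder IDs, for illustration only).
volumes = ["vol-0aaa1111bbbb22223", "vol-0ccc3333dddd44445", "vol-0eee5555ffff66667"]

# Each call snapshots one volume at its own point in time. Nothing groups
# these snapshots into a crash-consistent set across the servers, which is
# the gap described in the paragraph above.
for vol in volumes:
    ec2.create_snapshot(VolumeId=vol,
                        Description="per-volume snapshot, not a consistency group")
```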
For these reasons, and many others, storage products need to be reinvented as multi-tenant in order to be used in the cloud while providing the same functionality they deliver in the datacenter. Multi-tenant means that multiple users can share the system without losing the capabilities they had in a single-tenant environment: one user cannot affect the performance of the others, and each user retains full control over their own storage, just as they do today with classical SAN/NAS storage arrays in the datacenter.
This is the gap we saw in cloud storage, and it’s why we founded Zadara Storage back in 2011 with a vision to create a new storage architecture, reinvented for a ‘data-centric, location-agnostic’ world.