How to Reduce Robocopy Backup Time of SMB Volumes in the Cloud

Concurrent Robocopy Jobs

Even though Microsoft’s Robocopy is multi-threaded, the utility inherently limits the number of outstanding I/Os it can issue to your VPSA. This is because it has to read and traverse your directory structure while collecting the data to back up. The resulting processing delays are exacerbated by network latency.

Even with the fast sub-millisecond latency connectivity between our storage cloud and AWS, the extra delay is at least an order of magnitude greater than when moving data to and from local disks. That is, inherent Robocopy limitations are not noticeable in the scenarios for which the utility was originally designed, but they become apparent when Robocopy is used over a network.

The important question to ask is: “How do you work with network latency when adding Robocopy threads has little or no effect?” The answer is to add concurrent Robocopy jobs, which can issue parallel I/Os while the other jobs are scanning your directory structure.

Take a look at the results in Figure 1, where we took an extreme case of a large directory structure with lots of small files ranging from 1 KB to 4 KB and ran parallel Robocopy jobs.

Figure 1: Concurrent Robocopy Jobs

The backup was performed from the smallest possible configured VPSA to an AWS Workspace (which is a t2.medium instance with low-to-moderate network capability of ~100 Mbps). Using the default number of threads (/MT:8) as a single job, the baseline copy took 314 minutes. We then adjusted the number of parallel jobs and threads and ran the test seven more times; the best case was 6x faster than what Robocopy normally takes to back up the data.

Due to AWS Workspace memory constraints, running more than 42 parallel jobs causes significant process swapping, which results in worse performance. Even when running Robocopy jobs from an m3.xlarge instance (with plenty of RAM and a high network capability of ~1 Gbps), adding more jobs will increase copy times.

In general, we found that running eight parallel Robocopy jobs with /MT:8 performs best for both encrypted and unencrypted volumes while minimizing the impact on accessing and generating production data within your VPSA.
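To make the idea concrete, here is a minimal Python sketch of one way to run eight Robocopy jobs at a time, each with /MT:8. It is not the PowerShell script mentioned below, only an illustration under several assumptions: the source tree is split into one job per top-level folder (plus one non-recursive job for files in the root), the function names plan_jobs and run_jobs are hypothetical, and the /R:1 and /W:1 retry settings are illustrative choices rather than tested recommendations.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src, dst, threads=8, recurse=True):
    """Return the Robocopy argument list for one job."""
    cmd = ["robocopy", str(src), str(dst)]
    if recurse:
        cmd.append("/E")  # copy subdirectories, including empty ones
    # /MT:n sets the thread count; /R and /W (retries, wait seconds) are
    # illustrative values chosen for this sketch, not tested recommendations.
    cmd += [f"/MT:{threads}", "/R:1", "/W:1"]
    return cmd


def plan_jobs(src, dst, threads=8):
    """Assumed split: one job for root-level files, one per top-level folder."""
    src, dst = Path(src), Path(dst)
    # Without /S or /E, Robocopy copies only the files directly in src.
    jobs = [build_command(src, dst, threads, recurse=False)]
    for child in sorted(src.iterdir()):
        if child.is_dir():
            jobs.append(build_command(child, dst / child.name, threads))
    return jobs


def run_jobs(jobs, max_jobs=8, runner=subprocess.call):
    """Run at most max_jobs commands at once; return exit codes in job order."""
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return list(pool.map(runner, jobs))
```

On a Windows client, calling run_jobs(plan_jobs(source, destination)) with your own share and backup paths would start the copies. Robocopy exit codes of 8 or higher indicate that at least one copy failed, so the returned list is worth checking. Splitting by top-level folder balances well only when those folders are of similar size; a tree dominated by one huge folder would need a finer split.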

If you want to reduce backup times, please visit our forum, where we have detailed instructions and the Windows PowerShell script we use to run parallel jobs, or contact us at support@zadarastorage.com for assistance.

If you’d like to test your workloads on Zadara Virtual Private Storage Arrays (VPSA) at AWS or other service providers, you can start a free trial now at bit.ly/registerVPSA

 
