In a traditional storage environment, primary and backup storage are separate, and backups are based on copying data. Typically, the whole volume is copied from primary storage to backup storage every week or every day. If stored on backup storage without any capacity optimization, these backups can easily use up many times the space used on primary storage. Capacity-optimized backup storage systems overcome this problem using various techniques:

  1. Deduplication, aka dedupe. Successive full backups have mostly the same content because the change rate is generally small. Dedupe removes this duplicated content by sharing blocks across backups. Global dedupe goes a step further and shares identical blocks regardless of where they are, including identical blocks at different locations within a backup.
  2. Compression. Compression works on an individual block of data (generally less than 1 MB) at a time and shrinks it by exploiting redundancy within the block. Examples include the gzip utility and various LZ algorithms. (A toy sketch of both techniques follows this list.)
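
To make the two techniques concrete, here is a minimal sketch of a content-addressed block store that dedupes and compresses backup blocks. It is purely illustrative: the 4 KB block size, SHA-256 fingerprints, and zlib compression are assumptions for the example, not a description of any particular backup appliance.

```python
# Minimal sketch of block-level dedupe plus per-block compression.
# Illustrative only; real backup appliances use far more elaborate indexes.
import hashlib
import zlib

class DedupeStore:
    def __init__(self):
        self.chunks = {}  # fingerprint -> compressed block

    def put(self, block: bytes) -> str:
        fp = hashlib.sha256(block).hexdigest()
        if fp not in self.chunks:                   # identical blocks are stored once...
            self.chunks[fp] = zlib.compress(block)  # ...and compressed
        return fp

    def get(self, fp: str) -> bytes:
        return zlib.decompress(self.chunks[fp])

store = DedupeStore()
backup_1 = [store.put(b) for b in (b"A" * 4096, b"B" * 4096)]
backup_2 = [store.put(b) for b in (b"A" * 4096, b"C" * 4096)]  # the "A" block is shared
```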

In a previous article, Ajay wrote about the reasons for moving towards converged primary and backup storage. With converged storage, backups are based on volume snapshots. A snapshot is logically a point-in-time copy of the volume, but physically it shares all unchanged blocks with the primary state and other snapshots. There is no copying or duplication of data to begin with, so there is no need to deduplicate. This provides huge savings in CPU, network, disk, and memory utilization compared to first copying the whole volume and then deduping it back down. One might say that snapshot-based backups are not duped in the first place and don’t need dedupe: they are unduped.
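
As a rough illustration of that sharing (a toy model only, not Nimble’s actual on-disk format), a snapshot can be thought of as a frozen block map that points at the same physical blocks as the primary state:

```python
# Toy model of snapshot block sharing (not Nimble's actual implementation).
class Volume:
    def __init__(self, nblocks):
        # Block map: logical block number -> physical block (here, just bytes).
        self.blocks = {i: bytes(4096) for i in range(nblocks)}
        self.snapshots = []

    def write(self, lbn, data):
        # Only the written block gets new physical space; existing snapshots
        # keep pointing at the old block, so nothing is copied.
        self.blocks[lbn] = data

    def snapshot(self):
        # A snapshot is just another block map referencing the same physical
        # blocks as the primary state at this point in time.
        self.snapshots.append(dict(self.blocks))

vol = Volume(nblocks=1000)
vol.snapshot()                     # "backup" #1: no data copied
vol.write(42, b"x" * 4096)         # one block changes on the primary state
vol.snapshot()                     # "backup" #2: still no data copied
shared = sum(1 for i in range(1000)
             if vol.snapshots[0][i] is vol.snapshots[1][i])
print(shared)                      # 999 of 1000 blocks are physically shared
```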

In addition, in Nimble’s converged storage model, all data is compressed, including the primary state and backups. This provides a huge advantage compared to most primary storage systems, which do not compress randomly-accessed application data at all.

Next, I will focus on space usage—not because it is the most important difference, but because many interesting questions arise around it.

Proponents of deduping might assume that dedupe saves more space than unduped storage, because global dedupe can share identical blocks across backups as well as at different locations within a single backup, while unduped snapshots only share blocks at the same location. The intra-backup sharing does give dedupe a small advantage. However, unduped storage has a bigger advantage: the sharing of blocks between the primary state and backups! In essence, unduped converged storage keeps only one baseline copy of the volume, while separate deduped storage keeps two: one on primary storage and one on backup storage. As we will see, the primary-backup sharing outweighs the intra-backup sharing, so converged storage uses less total space than separate primary plus deduped backup storage.
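
As a quick illustration with assumed numbers (a 10 TB volume, 3% daily change, 30 days of retention, 2x compression): converged unduped storage holds roughly 0.5 × (10 + 30 × 0.3) ≈ 9.5 TB in total, while separate storage holds the 10 TB uncompressed primary baseline plus roughly 0.5 × 19 ≈ 9.5 TB of deduped, compressed backups, about 19.5 TB in total. Global dedupe’s extra cross-location sharing trims the backup side somewhat, but it cannot touch the second baseline sitting on primary storage.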

Below I present a mathematical comparison of the total space usage (including the primary state) between the following four types of storage:

  1. Unoptimized daily incremental and weekly full backups
  2. Global dedupe with compression (as in optimized backup storage)
  3. Unduped without compression (as in optimized primary storage)
  4. Unduped with compression (as in Nimble converged storage)

The following chart plots the capacity optimization ratio for each of the three optimized storage types. Capacity optimization is computed as the ratio of the total space used in unoptimized storage over the total space used in the specific optimized storage type. Higher values are better. (This ratio ignores the higher cost of primary storage compared to backup storage, and therefore significantly understates the advantage of converged storage, which uses less expensive storage.) The x-axis indicates the days of backup retention. In general, capacity optimization improves with retention.
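
For readers who want to check the arithmetic, here is a rough back-of-the-envelope model of the four space totals. The parameters (daily change rate, compression factor, and the extra cross-location sharing from global dedupe) are illustrative assumptions, not the figures behind the actual chart.

```python
# Back-of-the-envelope model of total space usage (primary state included).
# All parameters are illustrative assumptions, not published Nimble numbers.

V = 1.0    # volume size, normalized
c = 0.03   # fraction of the volume that changes per day (assumed)
k = 0.5    # compression factor, i.e., 2x compression (assumed)
g = 0.25   # extra reduction global dedupe gains from cross-location sharing (assumed)

def unoptimized(R):
    # Primary copy + a full backup every week + an incremental every day.
    return V + (R / 7) * V + R * c * V

def deduped_compressed(R):
    # Uncompressed primary + a backup store holding one deduped, compressed
    # baseline plus the unique changed blocks for R days of retention.
    return V + k * (1 - g) * (V + R * c * V)

def unduped(R):
    # Converged storage: one baseline shared by the primary state and all
    # snapshots, plus the changed blocks for R days, uncompressed.
    return V + R * c * V

def unduped_compressed(R):
    # Same as above, with everything compressed.
    return k * (V + R * c * V)

for R in (30, 60, 90, 240):
    base = unoptimized(R)
    print(R,
          round(base / deduped_compressed(R), 1),
          round(base / unduped(R), 1),
          round(base / unduped_compressed(R), 1))
```

With these assumed parameters, unduped with compression leads deduped with compression through 30–90 days of retention, and the two cross at roughly eight months, consistent with the shape of the chart described above.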

The chart shows the following:

  1. Deduping is a fine and necessary optimization for separate backup storage.
  2. Unduped converged storage without compression is not as effective as deduped storage with compression.
  3. Unduped converged storage with compression saves significantly more space than deduped storage with compression for typical backup retention periods of 30–90 days. In fact, dedupe would catch up with unduped in terms of capacity savings only if backup retention is longer than 8 months (a rough break-even estimate follows this list).
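
The break-even point can be estimated from the model sketched above. If k is the compression factor, c the daily change rate, and g the extra reduction global dedupe gets from cross-location sharing (all assumed values, not measurements), the deduped and unduped space totals are equal when the retention R satisfies k·g·(1 + c·R) = 1, i.e., R* = (1/(k·g) − 1)/c. Plugging in k = 0.5, c = 3% per day, and g = 0.25 gives R* ≈ 233 days, or roughly eight months.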

Of course, data protection is not complete without provision for disaster recovery, which requires an off-site replica. Comparisons similar to the one above can easily be made that include the space used on the replica. Unduped converged storage with replica retains a lead over separate deduped storage with replica, regardless of whether the primary or the backup storage is replicated. This is because unduped storage with replica has two baseline copies of data (one on converged storage and the other on replica), while deduped storage with replica has three (one on primary storage, one on backup storage, and one on replica).

Interestingly, matching the space savings of dedupe was not our top motivation for building converged storage. The major motivations were the following:

  • The ability to directly use backups and replicas without having to convert the data from backup to primary format.
  • Avoiding the massive transfer of data from primary to backup storage.
  • Enabling significant space savings without a performance impact for randomly accessed data, such as databases.

Nevertheless, it is good to demonstrate that unduped storage is not just as good as deduped storage at saving space; it is even better!

9 Responses to Better Than Dedupe: Unduped!

  1. chad says:

    Why not converge, compress, AND dedupe? How is this really different than just calling any vendor’s snapshot-capable system a “converged solution” because it can store snapshots for weeks or months?

    In terms of data security, how do you ensure discrete blocks for protection? (i.e., how does a RAID set failure not corrupt both your production and “backup” bits?) I do see the case for replicas in place of traditional backup methods, but to truly consider a backup with an offsite component, would you not need three devices: one for performing “local” restoration in the case of an array/RAID group/flash failure and another for “remote” recovery in DR scenarios?

    In traditional backup environments, you had your “production” bits and several additional “full backup” bits. Incrementals required the full backup to function, so they were not additional copies. In the VTL realm, you have the production bits and the backup bits on the second device, then the DR copy replicated to the third site, leaving three discrete sets of blocks on three discrete devices. In any case, if the primary failed, the backups were generally available, and if the backups failed, the primary copy was generally available.

    In the Nimble model, I’m not 100% certain that all production data is stored both in flash and on disk, as it seems “cold” bits can be relegated to disk-only storage, leaving you just one block failure away from catastrophe, since the backup and production uses both rely on this same block. Assuming a remote copy, recovery would involve procuring repaired equipment and then replicating the data back to the primary site. This doesn’t seem comprehensive without that key third Nimble device in the local site to provide local backup and recovery capabilities.

    • Umesh Maheshwari says:

      Chad, good comments and questions.

      How is Nimble’s converged storage different from any vendor’s snapshot-capable system? Nimble snapshots are compressed, stored on inexpensive high-capacity drives, use relatively small and application-tuned block sizes for optimal sharing of blocks, and do not require extra IO as with traditional copy-on-write snapshots. Furthermore, the primary state is also compressed, stored on inexpensive drives, and shares blocks with snapshots. These features enable Nimble converged storage to cost-effectively store a much larger number of snapshots than other primary systems.

      Why not converge, compress, AND dedupe? It’s doable, but we found that convergence (snapshot-based backups sharing blocks with the primary state) provides the biggest bang for the buck, followed by compression. Adding dedupe on top of converged storage frees up at most a third of the space, and often much less! (This is different from adding dedupe to secondary storage, where it frees up a lot of space because of duplication between full backup copies.)

      Don’t we need three devices too? We have found that most organizations do not need three devices to meet their data protection needs. The Nimble array is built to protect against common failures: it has redundant controllers, power supplies, fans, etc. Data on disk is protected with dual-parity RAID. Data on flash does not need to be redundant because flash is only a cache and the data is always backed up by data on disk. (We do detect corruption of flash data using checksums and self description.)

      In the unlikely case that the whole array goes down, because say it catches fire, the user will need to recover from a replica. The Nimble replica has a copy of the data in the native application format, ready for direct use by the application. This is different from a traditional backup, which is stored in some archival format on backup storage and needs to first be converted to the native format and copied to primary storage by the backup software before it can be used by the application. This difference makes a Nimble replica much easier and quicker to use.

      Nonetheless, more demanding organizations might indeed want a local replica in addition to a remote replica in order to recover even more quickly from a local replica.

  2. chad says:

    When evaluating backup systems, I have a tendency to “chase the blocks” to decide how many layers of protection there are.

    I’m interested in whether there are any industry insights answering the question “How many replicas replace a traditional enterprise backup system?”; however, I have yet to see any real solid published research in this area. My sense is that the consensus is closer to three disparate systems than to two, including the production data as the first copy.

    As for dedupe, I understand where compression has an advantage over dedupe in many situations (databases, for example), but there are others (like virtualization) where dedupe (even the post-processing type) can also give huge gains. These gains are then enhanced further by leveraging the flash in the system.

    • Umesh Maheshwari says:

      Chad,

      The number of data copies is an important metric. I don’t think it takes two replicas to replace the reliability of traditional backups. Traditional backups are in archival format and need to be restored to application format by backup software, and this bulky process introduces unreliability.

      However, if one really needs to recover from whole-system disasters very quickly, cascaded replication provides the ability to replicate to a local replica and then to a remote replica.

      Re dedupe: there are some primary applications that benefit from dedupe. However, most virtual desktop environments are better served using volume clones with shared blocks than with dedupe.

  3. Jason says:

    This is an interesting concept, but I agree that there doesn’t appear to be a significant difference between iSCSI SAN vendor appliances with SnapShots and “converged storage”. True, not all SnapShot technologies are created equal, but it looks as though the only advantage of the Nimble storage system is that you also compress these SnapShots. Couple of questions:

    1. How would you compare this to a CDP system sitting beside the primary storage? I think this would answer Chad’s question about having 3 systems. In most cases I’d want an immediate recovery option of data in the event of a system failure. DR is looked at only if there is a location/site failure.

    2. Your site is great and the message delivery is simple but there is one thing I’m confused about. Is Nimble Storage compressing primary stored data or just the snaps?

    3. How do you protect against cache failure or facilitate node fail-over with active cache? Also, what type of algorithms are in place to speed up write IO? Read IO? Any performance metrics published on this technology yet?

    It’s very interesting and we’ll be taking a look but I think you’re going to have to get some more details behind the marketing.

    • Umesh Maheshwari says:

      Jason, thanks for your comments and questions.

      Re local recovery from a system failure: A local replica provides a cost-effective solution. Snapshots are replicated from the primary to the local replica based on a schedule configured by the user. A subset of these snapshots on the local replica can then be replicated to a remote replica based on another schedule. Other solutions such as CDP are costlier and more complex. E.g., CDP needs to mirror writes without degrading performance and store the mirrored stream in real time.

      Re compression: We compress both primary and snapshot blocks. This provides 2-4x savings in space. Furthermore, unlike other storage systems, we store both primary and snapshot blocks on low-cost, high-capacity disks. Finally, as you noted, not all snapshots are created equal. Even independent of compression, our snapshots are more efficient than most commonly found implementations.

      Re cache failure: The system degrades gracefully with cache failure. If one or more SSDs fail, the system will keep functioning with a proportionally smaller cache. The cache is shared between the two controllers so that, on a controller failover, the cache is preserved.

      Re read and write IOPS: Reads benefit from the large flash cache. Writes benefit from NVRAM and sequentialized layout. Both reads and writes also benefit from cached indirect blocks.

      I’d also recommend a look at the white paper on our website, which describes our architecture and how we achieve extremely good random IO performance.

  4. تعلم البوكر says:

    I LOVE THIS BLOG!!!

  5. Rich says:

    The convergence of storage and backups has been underway for some time now, and many companies are doing pieces of it, but not in a streamlined manner. The Nimble design is very intriguing.
    I do wonder if there is an issue with the design scaling beyond a couple dozen TB, into hundreds of TB and petabytes.

    • admin says:

      I see no issue in scaling the Nimble design of keeping unduped backups. On the other hand, systems based on deduplication generally require in-memory data structures that grow linearly with the size of storage. These data structures significantly limit the scalability of such systems.
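
      For a rough sense of scale (illustrative figures, not measurements of any particular system): with 4 KB blocks and roughly 30 bytes of fingerprint index per unique block, 100 TB of unique data implies about 25 billion index entries, on the order of 750 GB of index, which quickly becomes impractical to keep in memory.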

      Umesh Maheshwari
      CTO
