It’s a privilege for me to help realize the vision of converged storage put forth by Varun. Similar ideas have been implemented by other innovative vendors, but not with the architectural foundation needed to support them fully. At Nimble, we are fortunate to be able to leverage technological advances such as flash memory and multi-core CPUs to create a clean-slate design that addresses real pain points of storage users. We were also able to assemble a team of seasoned architects to build it.

Our recipe for converging primary and backup storage is simple and has two parts.

1. Capacity optimization: Storing backups for 30–90 days needs lots of capacity. In a system not designed to store backups, they can easily use 10–20x the space used in primary storage. We handle this problem as follows:

  • Store all data on high-capacity disk drives. These disks have over 3x the capacity of high-performance disks at one-sixth the cost per GB. They also deliver only one-third the performance of high-performance disks, but we deal with that separately. (The high-capacity disks have often been called SATA disks, but that is quickly becoming a misnomer as high-capacity SAS drives enter the market.)
  • Use data reduction techniques such as compression and block sharing. These techniques can reduce the space used by backups by 10–20x. Block sharing can take many forms, e.g., snapshots and dedupe, and it is important to pick judiciously based on the context. I will write further about this in the next article.
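
To make the combined effect of compression and block sharing concrete, here is a minimal sketch. It is a toy model and not a description of how the Nimble product works: it assumes a content-hash form of block sharing, a 4 KB block size, and a made-up workload in which 2 of 100 blocks change per day. The names `BlockStore`, `make_block`, and `simulate_backups` are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # bytes; an arbitrary block size chosen for this sketch


class BlockStore:
    """Toy store: identical blocks are kept once, in compressed form."""

    def __init__(self):
        self.blocks = {}  # content hash -> compressed block

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:  # block sharing: store new content only
            self.blocks[digest] = zlib.compress(data)
        return digest

    def get(self, digest):
        return zlib.decompress(self.blocks[digest])

    def physical_bytes(self):
        return sum(len(c) for c in self.blocks.values())


def make_block(index, version):
    """Build a compressible, uniquely labeled block (made-up test data)."""
    return f"block {index} version {version} ".encode().ljust(BLOCK_SIZE, b".")


def simulate_backups(days=30, volume_blocks=100, changed_per_day=2):
    """Take one full backup per day while a few blocks change each day."""
    store = BlockStore()
    volume = [make_block(i, 0) for i in range(volume_blocks)]
    backups = []
    for day in range(days):
        if day > 0:
            for k in range(changed_per_day):
                i = (day * changed_per_day + k) % volume_blocks
                volume[i] = make_block(i, day)
        # A backup is only a list of references into the shared store.
        backups.append([store.put(block) for block in volume])
    logical = days * volume_blocks * BLOCK_SIZE
    return store, backups, logical


if __name__ == "__main__":
    store, backups, logical = simulate_backups()
    print("logical bytes: ", logical)
    print("unique blocks: ", len(store.blocks))
    print("physical bytes:", store.physical_bytes())
```

With the default parameters, 30 full backups reference 3,000 blocks but the store holds only 158 unique ones: the 100 original blocks plus 2 new blocks on each of the 29 later days. The padding in the test data compresses unrealistically well, so the physical byte count here says nothing about real-world ratios.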

2. Performance optimization, especially for random IO: Common business applications such as Exchange and SQL Server generate lots of random IO. Hard disks are generally bad at random IO, and high-capacity disks are particularly bad. We use two techniques that more than make up for this slowness:

  • Accelerate random reads using flash as a large cache. Most storage vendors have a story around using flash. However, flash has some peculiar characteristics, and how a system uses flash is more important than whether it uses flash. In particular, flash is not a performance cure-all; e.g., it might not be cost effective in accelerating random writes.
  • Accelerate random writes by sequentializing them on disk. This technology has been known for some time as log-structured file systems, but it has become more interesting recently because of new enabling technologies.
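
The two techniques above can be sketched together in a few lines. This is a simplified illustration under stated assumptions, not the actual on-disk layout: the "disk" is an in-memory append-only list, the "flash" is a small LRU dictionary, and garbage collection of overwritten blocks is left out entirely. The class `LogStructuredVolume` and its methods are hypothetical names.

```python
from collections import OrderedDict


class LogStructuredVolume:
    """Toy volume: writes append to a log, reads go through an LRU cache."""

    def __init__(self, cache_blocks=2):
        self.log = []               # append-only "disk": (lba, data) records
        self.index = {}             # lba -> position of newest record in log
        self.cache = OrderedDict()  # stands in for the flash read cache
        self.cache_blocks = cache_blocks
        self.disk_reads = 0         # counts reads that had to go to the log

    def write(self, lba, data):
        # Whatever the logical address, the write lands at the log's tail,
        # so random writes become sequential appends.
        self.log.append((lba, data))
        self.index[lba] = len(self.log) - 1
        self.cache.pop(lba, None)   # drop any stale cached copy

    def read(self, lba):
        if lba in self.cache:       # cache hit: no disk access needed
            self.cache.move_to_end(lba)
            return self.cache[lba]
        self.disk_reads += 1        # cache miss: fetch from the log
        _, data = self.log[self.index[lba]]
        self.cache[lba] = data
        if len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict least recently used
        return data

    def drop_cache(self):
        # The cache only ever holds copies, so losing it loses no data.
        self.cache.clear()


if __name__ == "__main__":
    vol = LogStructuredVolume()
    for lba, data in [(7, b"a"), (3, b"b"), (9, b"c"), (3, b"d")]:
        vol.write(lba, data)
    print([lba for lba, _ in vol.log])  # order on "disk" follows arrival order
    print(vol.read(3), vol.disk_reads)
```

The design choice worth noticing is that the log is the only authoritative copy. The cache can be emptied at any time and every read still succeeds, at the cost of one extra trip to disk.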

I will be writing more about these and related issues in this blog. Your thoughts and questions are most welcome.


3 Responses to A Clean-Slate Approach to Converging Primary and Backup Storage

  1. Ajay says:

    Umesh,
    Interesting post! Went through the white paper and liked it. Since the website talks about optimizing for virtual environments (VMware, Hyper-V installations), one of the key requirements for consolidation is isolation and performance controls of some sort. Was wondering if any such capabilities are provided by the CS series?

    Regarding SSDs – do they act mostly like a cache or store exclusive copies of data/metadata as well? The documentation shows them as mainly a caching layer.
    Look forward to more posts!

    • Umesh Maheshwari says:

      The Nimble CS-Series enables the user to switch caching on/off on a per-volume basis. (And it does provide control over per-volume space usage with quotas and reserves.)

      Our SSDs cache data and metadata, and this data/metadata is always also stored on hard disks. In other words, the SSDs never contain exclusive copies of data/metadata. This enables us to tolerate loss of flash.

      – Umesh

  2. [...] a new filesystem architecture called CASL™ (Cache Accelerated Sequential Layout). As I described in a previous post, CASL was designed from the ground up to provide a powerful combination of capacity and performance [...]
