When I talk to customers about Infinio, it often helps to explain the landscape of storage and where the industry is today. Shortly after I joined Infinio, I formulated a five-category taxonomy for the emerging storage landscape: Hybrid arrays, All-flash arrays, Software-Defined Storage, Hyper-Converged Storage, and Server-side Storage Services. I presented this model at BriForum.
However, I kept returning to the term “Software-Defined Storage” because it seems to have all the makings of an over-hyped term that quickly becomes nebulous. Lots of companies want to label their offerings with it, but is it just marketing hype rather than a proper architecture?
More recently, my thinking has evolved. I’ve concluded that, from the storage technology viewpoint, Hyper-Converged Infrastructure and Software-Defined Storage are really the same thing, which has changed the structure of my taxonomy significantly. Now I start with traditional dedicated storage array architectures and place both hybrid and all-flash arrays there. In contrast, I then discuss server-side virtual storage stacks, which include both Hyper-Converged Infrastructure and Server-side I/O Optimization offerings. You can see me present this version at my VMworld vBrownBag session.
Renaming Server-side Storage Services to I/O Optimization is an easy change. This category, where Infinio fits, is incredibly nascent, and while its description is clear (optimize any storage array with resources on the server), its name is still being figured out. Whether it’s called Server-side Storage Services, Storage Acceleration, or I/O Optimization, it’s all the same thing.
Deciding that Software-Defined Storage and Hyper-Converged Infrastructure are the same thing is a bigger jump. Let me share how I got there.
First, let’s look at what is commonly called Hyper-Converged Infrastructure. Hyper-Converged Infrastructure is primarily a packaging/sales concept. The vendors in this space sell an integrated hardware and software “building block” bundled together. The hardware is typically a server with direct-attached storage disks and PCI-e flash cards. All the software needed to run virtual workloads is packaged as well: hypervisor, system management, configuration tools, virtual networking, etc. And there is always a software storage stack bundled with the offering. It virtualizes the disk and flash hardware and presents a virtual storage array, including management and all the associated storage services, to the virtual machines. Let’s look at a few of these storage software stacks:
- Nutanix®’s hyper-converged appliances contain a proprietary Virtual Storage Appliance VM that is the source of a significant amount of their initial IP. It is modeled on the scale-out architecture of Google File System (GFS) and provides many of the services associated with storage arrays, such as in-line dedup, compression, snapshots and replication. Company sources have said it was built from a heavily modified version of the open source Cassandra technology.
- SimpliVity® is another notable company in this space, and they also package an integrated storage appliance with their offering. Their approach includes custom hardware that their virtual storage software uses to offload the compute load associated with dedup-style hash calculations, compression and replication.
- Which brings us to VMware itself and the VMware EVO:RAIL™ architecture. EVO:RAIL is Hyper-Converged Infrastructure Software available to hardware partners that wish to offer all VMware software components in their solution. The Virtual Storage Stack here? VMware Virtual SAN®.
VMware’s inclusion of VSAN as its storage stack in EVO:RAIL really brought home to me that, from a storage architecture perspective, Hyper-Converged Infrastructure and Software-Defined Storage are the same thing and that I should talk about them together. What about other Software-Defined Storage that is not packaged as Hyper-Converged Infrastructure? Nexenta® NexentaStor and EMC® ViPR® are prominent examples of technologies that their owners like to call Software-Defined Storage. They’re certainly storage software, but does that make them Software-Defined Storage?
- EMC ViPR was launched talking about separating the “data-plane” from the “control-plane”, which sounded a lot like the VMware/Nicira/NSX Software-Defined Networking story to me! 😉 ViPR is a meta-framework that aspires to be the “control-plane” for a variety of storage arrays and to pass through the “data-plane”, while also adding a RESTful object interface sitting above the arrays. It seems to me that its main use is to knit together all the different management frameworks that EMC has shipped with their different array lines over the years (oh, yeah, we can manage NetApp® Filers too!). On this last point, it sounds to me a bit like using Microsoft System Center to manage VMware vSphere, doesn’t it? 😉
- Nexenta’s NexentaStor flagship offering is a Solaris-based ZFS software appliance that speaks NAS protocols such as NFS. It can run either as a VSA on a hypervisor or as a physical appliance on dedicated hardware. Nexenta’s stack delivers a virtual hybrid array that uses PCI-e flash as a read cache/write log buffer and hard drives for capacity. Nexenta and their CEO/my friend Tarkan Maner (@TarkanManer) like to market the company as “the leader in Software-Defined Storage”. NexentaStor is a complete storage stack and has all the advanced storage array features you’d expect.
The thing about Nexenta is that I wouldn’t call it either Software-Defined Storage or Hyper-Converged Infrastructure. I would say that it is virtual scale-up hybrid array software. (BTW, in a similar vein, I would call VMware VSAN virtual scale-out hybrid array software.) Even on dedicated hardware I wouldn’t call Nexenta Software-Defined Storage or Hyper-Converged Infrastructure; I’d call it a Hybrid Storage Array and/or NAS appliance!
My current view of the storage landscape is that there are two big categories: Dedicated Storage Appliances and Server-Side Storage. They are differentiated primarily by whether or not the application is co-located on the same servers as the storage stack. Traditional, Hybrid and All-Flash Arrays are all examples of dedicated storage appliances. Server-side storage, by contrast, includes all forms of storage stacks that execute across one or more servers and are co-located with the consuming applications. These include the storage components of Hyper-Converged Infrastructure offerings, freestanding virtual storage stacks, and storage optimization technologies.
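For readers who think in code, here is a minimal sketch of that two-category split. It is only an illustration: the names `StorageOffering`, `colocated_with_apps` and `classify` are hypothetical, the example entries are generic descriptions rather than assessments of specific products, and the whole thing assumes that co-location is the only test that matters.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    DEDICATED_STORAGE_APPLIANCE = "Dedicated Storage Appliance"
    SERVER_SIDE_STORAGE = "Server-Side Storage"


@dataclass(frozen=True)
class StorageOffering:
    name: str
    # Hypothetical flag: True when the storage stack runs on the same
    # servers as the applications that consume it.
    colocated_with_apps: bool


def classify(offering: StorageOffering) -> Category:
    """Apply the single co-location test described in the text."""
    if offering.colocated_with_apps:
        return Category.SERVER_SIDE_STORAGE
    return Category.DEDICATED_STORAGE_APPLIANCE


# Generic examples taken from the categories named in the text.
examples = [
    StorageOffering("hybrid array", colocated_with_apps=False),
    StorageOffering("all-flash array", colocated_with_apps=False),
    StorageOffering("hyper-converged storage stack", colocated_with_apps=True),
    StorageOffering("server-side I/O optimization", colocated_with_apps=True),
]

for offering in examples:
    print(f"{offering.name}: {classify(offering).value}")
```

The point of the sketch is that packaging (appliance, VSA, bundled building block) never enters the decision; only where the stack runs relative to the applications does.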
In summary, I think Software-Defined Storage is the latest in a long line of tech buzzwords that has been so overly generalized that it now refers to any storage software and has become close to meaningless. So, I’m going to phase out using it and refer to the storage stacks in Hyper-Converged Infrastructure as just that, regardless of their packaging.
What do you think?