In all honesty, I first submitted the abstract for a talk on Software Defined Storage to SNIA very early this year. It seemed like a different world then, and I really had no idea what I was getting into. For the purposes of full disclosure, I was the sponsor for an SDS project before I left Dell, and so I had spent a lot of time sifting through all the data and the nonsense that surrounds it. It felt apropos to put together a talk on what we had learned about the topic: the use cases, the technologies, the market, and the business. In the intervening months, something happened out there to cause the topic to become an epic technological hot potato. I've now given two talks at two separate conferences, and served on two panels, all in the vain attempt to place a clinical definition on SDS. I don't think that such a definition will be possible. Additionally, I want to make it clear: I am a skeptic when it comes to the products on the market today. Let's talk about my somewhat slanted viewpoint:
SDS is Like Tobacco...
While this may sound like a cruel metaphor, it is apt. You can substitute alcohol, or any other recreational drug here, but the relationship would be the same. There is a short-term buzz you get from the products, but there are long-term problems and dangers that need to be managed. The sellers are fundamentally hoping that you are adult enough to not go on a bender and crash your car, or destroy yourself in some way, so you can keep coming back. There are use cases where the products work, but one should be very wary of assuming that, in a super-competitive industry, storage arrays are so massively mis-priced that it is economically advantageous to construct one from parts. Integration and testing is a huge task that effectively "de-commoditizes" all of the storage products available today. A bug that has a 0.1% chance of occurring may seem like an acceptable risk with 4 disks. With 100 disks, you have a different calculus to consider.
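To put rough numbers on that calculus, here is a minimal sketch. It assumes the 0.1% figure applies to each disk independently, which is my simplification for illustration and not a measured failure model; the function name is made up for this example.

```python
def chance_of_hitting_bug(num_disks: int, per_disk_chance: float) -> float:
    """Probability that at least one disk hits the bug.

    Assumes each disk hits the bug independently with the same
    probability (an illustrative assumption, not field data).
    """
    # P(at least one) = 1 - P(none), and P(none) = (1 - p) ** n
    return 1.0 - (1.0 - per_disk_chance) ** num_disks


if __name__ == "__main__":
    for disks in (4, 100):
        chance = chance_of_hitting_bug(disks, 0.001)
        print(f"{disks:>3} disks: {chance:.2%} chance of seeing the bug")
```

Under that assumption, 4 disks come out at about 0.4%, while 100 disks come out at about 9.5%. The risk grows almost linearly with the disk count at this scale, which is why a bug you could shrug off in a small box becomes a support problem in a large one.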
One of the comments that I heard at SDC is that we have effectively unlearned all the things that we learned 20-30 years ago about creating resilient systems from cheap disks. This isn't entirely true. What we have learned is that there may be better ways to get storage resiliency at very large scale. It's a sexy concept, and those who think they have the physical scale necessary owe it to themselves to try it. Those without cloud-sized data centers need to consider the possibility that the best bet is to rent disk space from companies that have the mass, e.g. Amazon or Google. Trying to convince someone that they can be Google or Amazon if only they bought your software is much like selling steroids to people who don't work out. Nice story, but missing key facts.
The Box Huggers Are Not Who You Think
It has been a long-speculated axiom of storage that people want boxes. If you want to sell storage, you need to put it in a metal box and sell it. Upon much contemplation and discussion on the topic, I have come to the conclusion that this has more to do with economics than anything else: customers are used to buying capacity and sellers are used to selling in the same manner. In fact, very few people have moved beyond the notion of paying for capacity. Generally, this is how storage works: you have some data of a particular size, and you want to put it in storage commensurate with that size. The hazier your needs are, the more likely you are to overbuy and hence overpay.
It should not be a surprise then, that the people who are most religious about clinging to boxes tend to be those who are selling them. In full disclosure, I have to confess that I may be one of this cohort. It really represents the easiest way to comprehend what you are selling and what the customer is buying. You would like to be able to offer X Terabytes of highly available storage, with quantified performance, up to N LUNs, up to M snapshots, etc. This is really an indication of tested limits more than anything else, and it should be construed as a support statement from the vendor to the customer. Without that metal box to test, guaranteeing any level of performance or functionality can quickly spiral into an unbounded problem. Hence, we love boxes more than anyone. Perhaps this is why many perceive that the market is troubled by the encroachment from the cloud scale services such as Amazon's and Google's. Nonetheless, the box hugger's view of the world is that any product without clearly defined and testable performance objectives is fundamentally a toy.
Most Customers Are Actually Data Huggers
In my wanderings over the last decade, I have met many, many customers who buy and manage infrastructure. The most important commonality among all of them is how tightly the security of their employers' data is tied to their success. The word "security" in this case is not to be understood as only protection from intrusion, but also the data's availability, performance, and the general control of its destiny. That last item - control - is perhaps the most important of all. Putting their data in the cloud gives them the same detachment that they get from sending their kids off to college. Risking data loss is so unacceptable that making copies and scattering them in as many places as possible seems to be de rigueur. That is pretty paranoid.
This should be a hint to everyone. There is a mentality evident among many software vendors that their role in providing the aforementioned security can be conveniently redlined above the disks or the hardware platform or the network. I don't think that there is a bias against software-only solutions. Rather, it's evident that the data-hugging masses aren't getting the feelings of security that they need to widely adopt the approach. As I said in the panel discussion last week, this is less of a technical problem and more of a business model problem. If a model exists that allows the technology to be delivered to customers with the warm, fuzzy feeling of control, I'm sure this market will find it. I don't see one right now.
So what will happen? Without some major redirection, it's clear that there is going to be a shakeup of sorts in this space. The minority of shops that have the expertise, the mandate, and the spare time will make their choices. The bulk of those choices will be for free software because that is the easiest way to rationalize the internal support costs of the do-it-yourself approach. In the end, the paying market for many of these solutions will not be large enough to support all the players. The winners will be either open source (i.e. profit-free), or solutions incorporated into existing platforms at no extra cost (watch VMware and Microsoft here). Finally, without addressing the needs of the data-hugging mid-market, it's hard to see any of these products gaining more than limited acceptance.
I'm pretty flexible with my opinion. I change my mind when the facts before me change. Right now, this is what I believe.