Whilst HDS and EMC throw rocks at each other over whether it is better to build custom parts or to take things off the shelf and go custom only when required (I expect the other Barry to sit on his hands, but there are good reasons why the SVC team decided to build out of commodity parts, and I suspect that they are very similar to EMC’s), I think we should look beyond the hardware and look at what is coming down the line to us.
The most important thing on the roadmap is FAST, Fully Automated Storage Tiering. FAST changes things; it takes a whole bunch of ideas from a whole bunch of places and runs with them. If you are another vendor and you feel aggrieved that EMC have stolen your idea, just take heart: it won’t be the first time in history that this has happened and it won’t be the last.
The foundation is Wide-Striping* using a model which splits your data into chunk(let)s and spreads it across spindles. Once these chunks are distributed, you can monitor the characteristics of the I/O at an individual chunk level; this allows us to do tiering at a sub-LUN level. A hot chunk of data can be moved to a higher tier and a cooler chunk of data down into a lower tier.
In the past we have been limited to moving a whole LUN (with the exception of Compellent); this has always been a time-consuming job, identifying what needs to move and then moving it. Yes, technologies have come along to make this easier, but to sweat the asset, and especially to make best use of SSDs, we needed to move individual ‘blocks’, as in a given file-system it is possible that only some blocks are hot and frequently accessed. Traditionally, if you could, you would hold these in cache, but if SSDs are expensive, cache is yet more so. This approach will allow some cache to be replaced by SSDs, and for some cache-unfriendly workloads you have, to all intents and purposes, massively increased the amount of cache available. You might not want to hold a terabyte or so of real cache for that evil 100% random read app, but with SSDs this becomes viable and not at a huge utilisation hit.
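To make the sub-LUN idea a bit more concrete, here is a minimal Python sketch of chunk-level tiering. It is purely illustrative and is not how EMC implements FAST: the tier names, the `Chunk` class, the `retier` function and the thresholds are all assumptions of mine. Each chunk keeps an I/O count for a monitoring window, and at the end of the window hot chunks move up one tier and cold chunks move down one.

```python
from dataclasses import dataclass

# Hypothetical tier names, fastest first; not EMC's terminology.
TIERS = ["ssd", "fc", "sata"]


@dataclass
class Chunk:
    chunk_id: int
    tier: str          # tier the chunk currently lives on
    io_count: int = 0  # I/Os seen in the current monitoring window


def retier(chunks, hot_threshold, cold_threshold):
    """Move each chunk at most one tier, based on its I/O count.

    Returns a list of (chunk_id, old_tier, new_tier) for chunks that moved.
    The thresholds are made-up tuning knobs for this sketch.
    """
    moves = []
    for c in chunks:
        idx = TIERS.index(c.tier)
        if c.io_count >= hot_threshold and idx > 0:
            new_tier = TIERS[idx - 1]   # promote a hot chunk
        elif c.io_count <= cold_threshold and idx < len(TIERS) - 1:
            new_tier = TIERS[idx + 1]   # demote a cold chunk
        else:
            new_tier = c.tier           # leave it where it is
        if new_tier != c.tier:
            moves.append((c.chunk_id, c.tier, new_tier))
            c.tier = new_tier
        c.io_count = 0                  # start a fresh monitoring window
    return moves
```

The point to notice is that the unit of movement is the chunk and not the LUN: one LUN can end up with a handful of chunks on SSD and the rest sitting on the cheap stuff.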
But there are going to be issues with the FAST approach; firstly, where do you put a new workload? If you simply assign it some disk and let the array decide, what the hell is it going to do with the workload? It could put it on the slowest tier possible and then migrate up; it could stick it on the fastest tier and migrate down. Both of these approaches have significant risk, so I suspect we are going to have to give the array some clues and we are going to have to understand more about the whole system we are putting in. The difference in performance between the top tier and the bottom tier is going to be large.
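The placement question can be sketched in a few lines of Python. This is an assumption-laden illustration, not anything EMC has described: the policy names, the `WorkloadHint` class and the IOPS-per-GB cut-offs are all invented here to show the three options (start low, start high, or let the admin give the array a clue).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tier names, fastest first.
TIERS = ["ssd", "fc", "sata"]


@dataclass
class WorkloadHint:
    # The admin's estimate of access density; a made-up hint format.
    expected_iops_per_gb: float


def initial_tier(policy: str, hint: Optional[WorkloadHint] = None) -> str:
    """Pick the tier a brand-new workload lands on."""
    if policy == "start_low":
        return TIERS[-1]        # cheap, but slow until the array promotes it
    if policy == "start_high":
        return TIERS[0]         # fast, but may evict data that deserves SSD
    if policy == "hinted":
        if hint is None:
            return TIERS[len(TIERS) // 2]   # no clue given: middle tier
        # Illustrative cut-offs only; real values would come from measurement.
        if hint.expected_iops_per_gb >= 1.0:
            return "ssd"
        if hint.expected_iops_per_gb >= 0.1:
            return "fc"
        return "sata"
    raise ValueError(f"unknown policy: {policy}")
```

Even in a toy like this, the hinted policy is only as good as the number you feed it, which is rather the point: somebody has to know what the workload looks like.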
No longer will the Storage Admin be a LUN Monkey; they are going to need to really understand their workloads and the applications. They are going to need to learn to talk to the application developers and understand workloads; they are also going to have to understand business cycles.
For example, applications which spend 11 months of the year pretty idle may suddenly need a lot of performance at year end. What happens if all your applications demand stellar performance once a year? Perhaps you need a way of warning the array that it needs to prefetch a load of data. Then consider a badly written end-of-year reporting extract which generates thousands of random read IOPS, or a badly written piece of user-generated SQL. In the past, this just crippled the application; with FAST, this could cripple the whole array as it tries to react.
The FAST approach is potentially the thin-provisioning of IOPS. This is going to need a lot of thinking about. Potentially you will have to domain storage to protect applications from the impact of one another. We are going to need to know more about the whole system than we have before if we are to truly benefit from FAST.
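The thin-provisioning comparison comes down to simple arithmetic, which a short Python sketch can show. Everything here is hypothetical: the function names, the idea of a single IOPS capacity figure for the fast tier, and the per-month peak demand table are my own simplifications. The first function gives the ratio of promised peak IOPS to what the array can actually deliver; the second finds the months where the applications all turn up at once.

```python
def oversubscription_ratio(capacity_iops, peak_demand):
    """Sum of each application's own peak, divided by deliverable IOPS.

    peak_demand maps an application name to 12 monthly peak IOPS figures.
    A ratio above 1.0 means we have promised more than we can deliver
    if every application peaks together.
    """
    promised = sum(max(monthly) for monthly in peak_demand.values())
    return promised / capacity_iops


def oversubscribed_months(capacity_iops, peak_demand):
    """Return (month_number, total_iops) for months where demand exceeds capacity."""
    totals = [
        sum(monthly[m] for monthly in peak_demand.values())
        for m in range(12)
    ]
    return [(m + 1, total) for m, total in enumerate(totals) if total > capacity_iops]
```

Feed it two applications that idle for eleven months and both peak in December, and the ratio creeps over 1.0 while only month 12 is flagged. That is the year-end problem in miniature, and domaining the storage is one way of making sure each application can only exhaust its own share.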
That means building rules which suit your applications. Sure, V-MAX will come with its own canned rules for things like VMware and known applications; indeed, EMC will probably be leveraging all the performance data that they have been gathering over the years to help us write the rules. Storage Templates as described by Steve Todd here are just the start.
So although at one level the Storage Admin’s job could get a lot easier, the Storage Manager’s job has got a whole lot harder. Yes Barry, I asked for FAST and now you’ve given it to us; now we’ll have to work out what this all means!
I have some really ‘interesting’ ideas as to where EMC could take V-MAX but they’ll have to wait for another time as I’m still supposed to be on leave from Enterprise IT.
* It’s Wide Striping not Wide StripPing as I keep seeing written; Wide Stripping is what happens on a Rugby Tour after a good night out!

