Saturday, January 31, 2009

Storage Efficiency

So, I've been sitting here thinking that with the current economic distress everyone is looking to save money. In the storage business, this means an almost myopic focus on something called "storage efficiency". Everyone wants to get the most "bang for the buck" that they can right now, and they really don't want to talk about much else, and that's really too bad.

I say it's too bad, because for those few who are bigger thinkers, people who are willing to go out on a limb and take a more strategic view of things, right now is a great time to make some changes that will, at the end of all this, leave their business with a stronger, better, more sustainable storage infrastructure. Or better yet, should those at the top of the IT pyramid actually have magically found some stones, they could create an entire IT organization that's better, stronger, and faster than it is now and one that even operates more efficiently than the one they have today.

Unfortunately, what I'm seeing is fear, and the result is that people are pulling back. They are dragging out or postponing projects, turning the screws on their vendors to reduce costs, and some are laying off people or even going so far as to outsource. I won't even go into why I think that anyone who outsources today is both a fool and a traitor to this country; that's for another time/post.

To those few who have the courage to build instead of tear down, to those who recognize opportunity in the current economic climate, I say bravo. To the rest, I give the Bronx Cheer.

But back to the topic at hand. What I find interesting is this myopic focus on "Storage Efficiency" on the part of the consumers of storage, and the resulting response from the vendors of storage. All of the big storage vendors have some kind of "Storage Efficiency" marketing strategy going. The blogosphere is full of arguments about how vendor A's storage is very inefficient, and of supporters of vendor A defending that vendor's storage efficiency. In the end, I don't think that any vendor's storage hardware is inherently more efficient, or less efficient, than any other vendor's. It's all about how you lay out your applications on that array, how well you manage the space, and how well you are able to tier the data. In other words, in the end, it's about people. In this case, it's the Storage Architects and Storage Admins who do the grunt work of managing a company's storage infrastructure on a day to day basis. If they are good and are allowed to obtain the tools that they need, you get efficient storage utilization. Otherwise, you end up with very low utilization rates. My fear, however, is that with all of this focus on "Storage Efficiency" from a hardware perspective, those folks in the trenches won't be allowed to get what they need in order to truly make a company's storage more efficient than it is today. Management will fall prey to all that marketing hype and think that if they just switch from vendor A to vendor B, all of their problems will be solved. Oh, and to pay for that switch, and since it's going to be soooo much easier to manage vendor B's storage, let's lay off a couple of those Storage Admins we aren't going to need anymore. Again, for those folks I have no sympathy, and they deserve the disaster that's waiting for them just around the corner.

In the end, I think that given the opportunity to do some storage virtualization in conjunction with server virtualization and network virtualization, storage could become very efficient. When you do all three together, you end up with a very efficient data center, as well as a very green data center. Yes, that's right, I said green data center. I fully realize that green is sooooo 2008 and no one wants to talk about it anymore (back to that myopic focus on "Storage Efficiency"). But I think that if you look at the big picture, the more efficient your storage/servers/networks are, the greener they are. That means real dollar savings, folks, so let's not stop talking about "green" just yet.

So, in my opinion, for those that are willing to invest in the future, I say build a "virtual datacenter". Some call it "Unified Computing", some call it "Cloud Computing", and some have other names for it. But as I see it, it's just creating an environment in which business users can run the applications they need in order to operate the business. I think that the "virtual datacenter" would allow for containerized applications. This means that the user's applications, including the code and the data, would be in some kind of portable container that could be easily moved, expanded, shrunk, spun up or spun down, depending on the needs of the business. Add to this a way for business users to deploy their own applications into the environment and you completely change the relationship between IT and the business.

Yes, I know this concept isn't for the faint of heart, especially in today's economic climate. But in the end I truly believe what you would have is a much more efficient, flexible, responsive IT organization which has a much better relationship with the business. Heck, you might even end up with IT being viewed by the business as something other than just a cost center which needs to be controlled! Yeah, I know, fat chance, but I can dream, can't I?

--joerg

Wednesday, January 28, 2009

Wide striping is a two edged sword

I have spent a lot of time lately talking with some of my coworkers, friends, etc. on the topic of wide striping. This topic keeps coming up since there are now a number of vendors selling storage arrays with SATA drives that claim to have "the same performance as fiber channel". Some of the Sales folks I work with keep asking whether it's true and, if not, how we are supposed to dissuade people from that idea. One of the prime offenders in this regard is IBM with their new XIV array. The XIV uses wide striping and SATA drives and they claim to have "enterprise performance" at a very low price point. But they aren't the only ones; you have Dell telling people the same thing about their EqualLogic line of storage as well, and there are others too. For an excellent article about the XIV and its performance claims, take a look at http://thestorageanarchist.typepad.com/weblog/2009/01/1037-xiv-does-hitachi-math-with-roman-numbers.html.

What I usually tell them is that the statement is true; you can get fiber channel performance by striping across a large number of SATA drives. The only problem is that you have to give up a lot of usable disk space in order to keep it that way. A quick example usually illustrates the point quite well. Let's say that for the sake of easy math the average application in your environment uses about 5TB of space (I'm sure some are a lot more, and some a lot less, but we are talking average here). Let's also say that you need about 2,000 IOPS per application in order to maintain the 20ms max response time you need in order to meet the SLAs you have with your customers. Finally, let's also assume that your SATA array has about 90TB of usable space using 180 750GB SATA drives and you can get about 20,000 IOPS in total from the array. So, let's do some basic math here. At 2,000 IOPS each, the array can run about 10 applications, and at 5TB apiece those will take up about 50TB. So, your array will perform well, right up until you cross the roughly ½ full barrier. After that, performance will slowly decline as you add more applications/data to the array.
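For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the arithmetic above. The function and parameter names are made up for illustration, the figures are the ones from the example, and it assumes that IOPS demand simply adds up across applications.

```python
# Figures come from the example above; the function itself is only an illustration.
def iops_limited_fill(array_iops, usable_tb, app_iops, app_tb):
    """Return (apps supported, TB consumed, fraction of usable space) at the IOPS ceiling."""
    apps = array_iops // app_iops      # whole applications the array can serve
    used_tb = apps * app_tb            # space those applications occupy
    return apps, used_tb, used_tb / usable_tb

apps, used_tb, fill = iops_limited_fill(
    array_iops=20_000, usable_tb=90, app_iops=2_000, app_tb=5
)
print(f"{apps} applications, {used_tb} TB used, {fill:.0%} of usable space")
```

With these inputs the IOPS ceiling is reached at 10 applications and 50TB, a bit over half of the 90TB you paid for. The ratio of usable to consumable space, 90/50 = 1.8, is the factor by which the real cost per GB exceeds the advertised one.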

So, what does this mean? It means that the cost per GB of these arrays is really about twice what the vendors would have you believe. OK, but considering how much cheaper SATA drives are than 15K fiber channel drives, that's still OK, right? Sure, as long as you are willing to run your XIV at ½ capacity. In today's economic climate, that's going to be tough to do. I can just imagine the conversation between your typical CIO and his Storage Manager.

Storage Manager – "I need to buy some more disk space."

CIO – "What are you talking about, you're only at 50% used in these capacity reports you send me and we didn't budget for a storage expansion in the first year after purchase!"

Storage Manager – "Well, you know all that money we are saving by using SATA drives? Well, it means I can't fill up the array; I have to add space once I reach 50% or performance will suffer."

CIO – "So let performance suffer! We don't have budget for more disk this year. Why didn't you tell me this when you came to me with that 'great idea' of replacing our 'enterprise' arrays with a XIV?!?!"

Storage Manager – "Ahhh … ummmmm … gee, I didn't know, IBM didn't tell me! But we had some performance issues early on, and figured this out. Do you really want to tell the SAP folks that their response time is going to double over the next year?"

CIO – "WHAT! We can't let that happen, we have an SLA with the SAP folks and my bonus is tied to keeping our SLAs! How could you let something like this happen! Maybe I should use the money for your raise to pay for the disks!"

Storage Manager – "Um, well, actually, we need to buy an entire new XIV, the one we have is already full."

OK, enough fun, you get the idea … make sure you understand what wide striping really buys you and if you decide that the TCO and ROI make sense, make sure you communicate that up the management tree in the clearest possible terms. Look at the applications that you currently run, see how much space they require, but don't base the sizing of your EqualLogic (see, I'm not just bashing the XIV) just on your space requirements. Base it more on your IOPS requirements. With SATA drives, chances are pretty good that if you size for IOPS, you'll have more than enough space.
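Sizing for IOPS first and space second can be sketched like this. It is a rough illustration, not a vendor sizing tool: it scales from the 180-drive, 20,000 IOPS, 90TB reference array in the example above, assumes IOPS grow linearly with the number of drives, and uses a hypothetical workload of 15 applications.

```python
import math

def drives_needed(required_iops, required_tb,
                  ref_drives=180, ref_iops=20_000, ref_usable_tb=90):
    """Return (drives to buy, drives needed for IOPS, drives needed for space)."""
    for_iops = math.ceil(required_iops * ref_drives / ref_iops)
    for_space = math.ceil(required_tb * ref_drives / ref_usable_tb)
    return max(for_iops, for_space), for_iops, for_space

# Hypothetical workload: 15 applications at 2,000 IOPS and 5 TB each
total, for_iops, for_space = drives_needed(15 * 2_000, 15 * 5)
print(f"buy {total} drives ({for_iops} for IOPS, {for_space} for space)")
```

Here the IOPS requirement asks for 270 drives while the space requirement alone would stop at 150, so sizing on space would leave the array well short on performance.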


--joerg

Tuesday, January 27, 2009

2009 Outlook

Like everyone else, I'm looking at the business climate in 2009, and it makes me nervous. I listen to the news reports of more layoffs and cutbacks that come almost nightly, and wonder what that means to me and to the storage business. I have coworkers who suggest that storage is recession-proof: that no matter what the economy is doing, data will continue to grow, and thus companies will have to continue to grow their storage infrastructure. I'm not sure that I buy it, but that just might be my nerves talking. Perhaps it's just that I tend to believe that the truth typically lies somewhere in the middle. So, I thought I'd take a minute and describe what I think is going to happen this year. No guarantees, I can't predict the future, but a little speculation is always fun.

Storage will continue to grow just not as fast
Yup, I do believe that the amount of data that companies keep on hand will continue to grow, just not at the same rate it has in the past. Depending on whom you want to believe, storage has been growing at a 40-60% CAGR or even more. I'm guessing that in 2009 we are not going to see that kind of growth. With the reduced sales volume that most companies will see due to the recession, there's got to be an attendant reduction in the amount of data that gets created. How much is the $64,000.00 question. I suspect that the growth rate might be cut in half, or even more. Add to this the fact that budgets are getting slashed and storage managers are going to be looking to extend the useful life of the storage that they have on hand, and it makes me think that this year the average growth rate for storage is going to sit somewhere between 5% and 10%. So, overall I believe that the volume of raw disk sales is going to drop dramatically. I'm probably not the only one looking at things that way; look at the major storage vendors: they are all cutting forecasts, laying off people, and generally cutting back.
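To put the 40-60% and 5-10% figures above side by side, here is a small Python sketch. The 100TB installed base is a hypothetical number chosen only to make the result easy to read; the growth rates are the ones quoted in the text.

```python
def added_tb(installed_tb, annual_growth):
    """New capacity needed after one year at the given growth rate."""
    return installed_tb * annual_growth

installed_tb = 100  # hypothetical installed base, purely for illustration
scenarios = {"40% growth": 0.40, "60% growth": 0.60,
             "5% growth": 0.05, "10% growth": 0.10}
purchases = {name: added_tb(installed_tb, rate) for name, rate in scenarios.items()}
for name, tb in purchases.items():
    print(f"{name}: about {tb:.0f} TB of new capacity")
```

On a 100TB base, that is 40-60TB of new disk in a typical year against 5-10TB this year, which is the kind of drop in raw disk volume the paragraph is pointing at.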

It's not all doom and gloom
I think that in this situation, however, there is some opportunity. Storage providers that can help the storage managers at their clients deal with budget reductions and find ways to do more with less will get quite a bit of business. I also think that companies, like the one I work for, that can package best of breed hardware and software into solutions that are very cost effective will also do well. Vendor loyalty, however, is going to go out the window and companies that were once locked into a single vendor will look at other vendors if they perceive that other vendor as being more cost effective. Again, this means opportunity for vendors to get into companies that they had previously been locked out of. I predict that we are going to see some of the major storage users leave the "big four" (EMC, NetApp, Hitachi, IBM) and move to storage from smaller players in an effort to reduce both CAPEX and OPEX costs.

The year of storage efficiency and virtualization
Finally, this year it will all be about efficiency and virtualization. I'm betting that CIOs will actually accelerate any server virtualization projects that they currently have in the works in order to get those reduced costs as quickly as they can get them. However, what they will find is that unless they are quite careful, their server virtualization project might result in increased spending on storage, backup/recovery, and DR that they hadn't planned for. This can be overcome to some extent by partnering with storage suppliers that understand the issues involved when dealing in a virtualized world. I also predict that sales of things like data deduplication and thin provisioning are going to accelerate this year. Again, all of this is in an effort to "do more with less" on the part of storage consumers.
So, overall, I'm cautiously optimistic that for those that can show their customers how to "do more with less", this year will be a challenge, but in the end they will survive. For those who continue to try and do business as usual, well, they might find this year to be more difficult.

--joerg

Tuesday, January 6, 2009

IBM XIV Could Be Hazardous to Your Career

So, I haven't blogged in a while. I guess I should make all of the usual excuses about being busy (which is true), etc. But the fact of the matter is that I really haven't had a whole heck of a lot that I thought would be of interest, certainly there wasn't a lot that interested me!

But now, I have something that really gets my juices flowing. The new IBM XIV. I don't know if you've heard about this wonderful new storage platform from the folks at IBM, but I'm starting to bump into a lot of folks that are either looking seriously at one, or have one, or more, on the floor now. It's got some great pluses:

  • It's dirt cheap. On top of that, I heard that IBM is willing to do whatever it takes on price to get you to buy one of these boxes, to the point that they are practically giving them away. And, as someone I know and love once said "what part of free, isn't free"?
  • Fiber channel performance from a SATA box. I guess that's one of the ways that they are using to keep the price so low.
  • Tier 1 performance and reliability at a significantly lower price point.

So, that's the deal, but like with everything in this world, there's no free lunch. Yes, that's right, I hate to break it to you folks, but you really can't get something for nothing. The question to ask yourself is, is the XIV really too good to be true? The answer is yes, it is.

But the title of this blog is pretty harsh, don't you think? Well, I think that once you understand that the real price you are paying for the "almost free" XIV could be your career, or at least your job, then you might start to understand where I'm coming from. How can that be? Well, I think that in most shops, if you are the person who brought in a storage array which eventually causes a multi-day outage in your most critical systems, your job is going to be in jeopardy. And that's what could happen to you if you buy into all of the above from IBM regarding the XIV.

What are you talking about, Joerg?!? IBM says that the XIV is "self healing", and that it can rebuild the lost data on a failed drive in 30 minutes or less. So how can what you said be true? Well folks, here's the dirty little secret that IBM doesn't want you to know about the XIV. Due to its architecture, if you ever lose two drives in the entire box (not a shelf, not a RAID group, the whole box, all 180 drives) within 30 minutes of each other, you lose all of the data on the entire array. Yup, that's right, all your tier 1 applications are now down, and you will be reloading them from tape. This is a process that could take you quite some time, I'm betting days if not weeks to complete. That's right, SAP down for a week, Exchange down for 3 days, etc. Again, do you think that if you brought that box in, after something like that your career at this company wouldn't be limited?

So, IBM will tell you that the likelihood of that happening is very small, almost infinitesimal. And they are right, but it's not zero, so you are the one taking on that risk. Here's another thing to keep in mind. Studies done at large data centers have shown that disk drives don't fail in a completely random way. They actually fail in clusters, so the chances of a second drive failing within the 30 minute window after that first drive failed are actually a lot higher than IBM would like you to believe. But, hey, let's keep in mind that we play the risk game all the time with RAID protected arrays, right? But the big difference here is that the scope of the data loss is so much greater. If I have a failure in a 4+1 RAID-5 raid group, I'm going to lose some LUNs, and I'm going to have to reload that data from tape. However, it's not the entire array! So I've had a much smaller impact across my Tier 1 applications, and the recovery from that should be much quicker. With the XIV, all my Tier 1 applications are down, and they have to all be reloaded from tape.
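For a feel of what "very small, but not zero" looks like, here is a back-of-the-envelope Python sketch. It is a baseline only: it assumes drives fail independently at a constant rate, which is exactly the assumption the clustering studies mentioned above call into question, and the 3% annualized failure rate is a made-up input, not a published XIV figure. The 180 drives and 30 minute window are the numbers from this post.

```python
import math

HOURS_PER_YEAR = 8760

def p_second_failure(remaining_drives, window_hours, afr):
    """Probability that at least one remaining drive fails inside the window,
    assuming independent failures at a constant rate (exponential model)."""
    rate_per_hour = -math.log(1 - afr) / HOURS_PER_YEAR  # per-drive rate implied by the AFR
    return 1 - math.exp(-remaining_drives * rate_per_hour * window_hours)

AFR = 0.03  # assumed annualized failure rate, for illustration only
p = p_second_failure(remaining_drives=179, window_hours=0.5, afr=AFR)
print(f"chance of a second failure in the window: {p:.4%}")
```

Under these assumptions the answer comes out at roughly 0.03% per first failure, and that exposure repeats every time a drive in the box fails. Since real failures tend to cluster, treat the result as a floor rather than an estimate.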

Just so you don't think that I'm entirely negative about the XIV, let me say that what I really object to here is the use of a XIV with Tier 1 applications or even Tier 2 applications. If you want to use one for Tier 3 applications (e.g. archive data) I think that makes a lot of sense. Having your archive down for a week or two won't have much in the way of a negative impact on your business, unlike having your Tier 1 or Tier 2 applications down. The one exception to that I can think of is VTL. I would never use a XIV as the disks behind a VTL. Can you imagine what would happen if you lost all of the data in your VTL? Let's hope that you have second copies of the data!

Finally, one of the responses from IBM to all of this is "just replicate the XIV if you're that worried". They're right, but that doubles the cost of storage, right?