Monday, November 25, 2013

Forecast: Cloudy with a lot of public and a touch of private...

If you believe what the pundits tell you, then private cloud is all the rage for enterprise customers.  Certainly, if you look at what we've been doing here at EVT, there's evidence to suggest that's actually true.  Our enterprise customers all seem to be either interested in, currently deploying, or already running some kind of private cloud.

Forrester Research says that 31% of enterprise customers already have a private cloud in place and 17% plan to build one over the next year.  However, when you dig a little deeper, what you'll find is that only 13% actually have something that fits the "true" definition of private cloud.  Most have some kind of virtualization in place with some added software to help manage that virtual infrastructure.  But, more often than not, those so-called private clouds are missing some key elements of a "true" private cloud.

Part of the problem could be that IT has a very loosey-goosey definition of what private cloud really is - and therefore of what it brings to the table for IT and for IT's customers.  The National Institute of Standards and Technology (NIST) says that for Infrastructure as a Service (IaaS) to be considered a cloud, it must have five essential characteristics:

  1. On-demand self-service
  2. Broad network access
  3. Resource pooling
  4. Rapid elasticity
  5. Measured service

IT's definition of a cloud is often very different, and can vary from "we have a data center" to "we look just like Amazon Web Services".  But without the five essential characteristics above, IT will not be able to deliver what most people are really after when they go to "the cloud".  The scalability, elasticity, and cost savings that the public cloud promises to IT's business customers are the real goals that IT should be looking to match with a private cloud.

So why is public cloud growing so rapidly?  AWS was a $2 billion business last year, and it is predicted to double that this year.  Yet, as you can see above, private cloud seems to be struggling to gain traction in the data center, especially when you consider the number of data centers that have a private cloud in name only (PCINO).  I suspect that there are a number of reasons.

First, moving to a true private cloud is a very difficult cultural and organizational hurdle for most IT departments.  It really means a shake-up of IT at the most fundamental level, from the top to the bottom.  It means that IT will truly have to become the service organization it has been trying to become for a long time, and that many have yet to reach. That's the cultural change. IT also needs to change from an organizational perspective: moving away from vertically siloed departments such as server, storage, and network to horizontally organized teams is key to achieving the results IT desires and to competing successfully with public cloud providers.

It should be noted here that IT often attempts to "cheat" the organizational change by "matrixing" people from existing IT departments into new "cloud" organizations. This often leads to failure, since those "matrixed" people tend to bring with them old ideas about how things should be run, as well as old processes and procedures.

This change also must go beyond just IT.  The purchasing department, for example, must understand the new model for purchasing converged infrastructure: they can't be allowed to "break up" the converged infrastructure and purchase the individual components through old, existing vendor relationships.  This continues into IT as well; converged infrastructure means just that, converged. Equipment that was traditionally purchased directly from the manufacturer may now be part of the converged infrastructure stack and thus will be purchased as part of that solution. These old relationships with vendors and manufacturers often get in the way of achieving "true" cloud.

So, IT's inability to make the cultural and organizational changes needed to compete successfully with the public cloud is one reason I believe private cloud adoption is where it is today.  A related reason is that in some cases IT recognizes the issues and actually starts to utilize the public cloud to deliver services to its end users.  This is often an attempt to reel back into the fold the "shadow IT" that has already deployed solutions in the public cloud, and it is usually followed quickly by IT talking about hybrid cloud. In many cases that's because IT feels it just can't compete against the public cloud for all applications, and thus comes to the reluctant conclusion that rather than lose the entire pie, it's willing to give some part of the pie to the public cloud and build a private cloud for the rest. There's also an unspoken idea on the part of IT that once they get their hybrid cloud up and running, they will eventually prove to the business that they are better than the public cloud, and thus the majority of applications will move into that private cloud over time, leaving only a small handful of applications in the public cloud.

In the end, I think that unless IT can address the barriers to private cloud discussed above, their dream of making the public cloud a temporary home is just a pipe dream. But in either case, IT's future is one in which it acts as a service provider and service director that helps the business find the best, most cost-effective home for its applications.

Sunday, September 8, 2013

Is OpenStack ready for prime time yet?

For those who've been reading this blog for a while, or who know me, you know that while I've been in the data center business for a long time, lately I've been focused on storage and backup. However, over the last couple of years I've been watching the infrastructure business change.  What I find interesting is that what's old is new again!

When I first started out in "Open Systems", network, server, and storage were all managed as a single entity. So, here we are again. A "pod" or stack is just network, server, and storage all managed together, as a single entity.  The new wrinkle is that we also size them as a single entity, which provides a number of advantages. But that's for another blog. As a matter of fact, I plan to write a couple of blogs on IaaS/PaaS/SaaS, how to move successfully to "the cloud", and data protection in a cloud environment.

In this blog, I want to talk about one of the "stacks", called "OpenStack". The first questions I get asked when I begin to talk about OpenStack are: what's the difference between a "stack" and a "pod"?  Why is it called OpenStack and not OpenPod? The confusion is quite understandable, since the amount of hype and marketecture around everything having to do with "the cloud", including this topic, is enormous.  As a matter of fact, it's so bad that some of the terms are, in my opinion, starting to become meaningless.  So I like to start out any discussion of these topics with a couple of definitions, so that the audience and I are on the same page. According to Wikipedia:

OpenStack is a cloud computing project to provide an infrastructure as a service (IaaS). It is free open source software released under the terms of the Apache License. The project is managed by the OpenStack Foundation, a non-profit corporate entity established in September 2012 to promote OpenStack software and its community.

This calls for a definition of IaaS (Infrastructure as a Service), again from Wikipedia:

In the most basic cloud-service model, providers of IaaS offer computers - physical or (more often) virtual machines - and other resources. (A hypervisor, such as VMware, Hyper-V, Xen or KVM, runs the virtual machines as guests. Pools of hypervisors within the cloud operational support-system can support large numbers of virtual machines and the ability to scale services up and down according to customers' varying requirements.) IaaS clouds often offer additional resources such as a virtual-machine disk image library, raw (block) and file-based storage, firewalls, load balancers, IP addresses, virtual local area networks (VLANs), and software bundles. IaaS-cloud providers supply these resources on-demand from their large pools installed in data centers. For wide-area connectivity, customers can use either the Internet or carrier clouds (dedicated virtual private networks). 

Note that IaaS can also be implemented in a private cloud (in your data center), or in a combination of public and private clouds, called a hybrid cloud.  This ability to utilize the resources of both a private cloud and a public cloud is becoming more and more interesting to large enterprises.  Again, more on this in a later blog, where I will talk about the economics of "cloud".
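
To make the "on-demand self-service" part of that definition a bit more concrete, here is a minimal sketch of provisioning a VM against an OpenStack IaaS API using python-novaclient, the OpenStack compute client. The credentials, endpoint, image, and flavor names are placeholders I've made up for illustration; they're not from any particular deployment, and your cloud's names will differ.

```python
# Minimal sketch: self-service VM provisioning against an OpenStack cloud.
# All credentials and names below are hypothetical placeholders.
from novaclient.v1_1 import client

nova = client.Client("demo_user",        # username (placeholder)
                     "demo_password",    # password (placeholder)
                     "demo_project",     # tenant/project (placeholder)
                     "http://keystone.example.com:5000/v2.0/")  # auth URL (placeholder)

# Pick a machine image and an instance size from what the cloud offers...
image = nova.images.find(name="ubuntu-12.04")
flavor = nova.flavors.find(name="m1.small")

# ...and request a new virtual machine. No ticket, no phone call - this is
# the "on-demand self-service" characteristic in practice.
server = nova.servers.create(name="web01", image=image, flavor=flavor)
print(server.status)
```

The point isn't the specific library; it's that the consumer of the service gets resources programmatically, from a shared pool, without waiting on a manual provisioning process.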

OK, so enough of laying the groundwork.  Let's talk about OpenStack, and see if we can answer the basic question: is it ready for "prime time"?  Can I use it in the enterprise to implement my private cloud IaaS infrastructure? The answer is, maybe. Let's talk about it a bit.

First, the interest in OpenStack is clearly growing, and growing quickly. You can see this by looking at the attendance of the OpenStack Summit, which started out life with a $15,000 budget and 75 people who were basically coerced to go. The most recent OpenStack Summit had a $2 million budget and over 3,000 attendees. So interest is clearly up, though nowhere near the kind of interest that VMware has managed to generate; the most recent VMworld had over 23,000 attendees.  So, no doubt, lots of interest. But what's driving it? Obviously, cost is a big consideration. Since OpenStack is open source, the cost of implementing it is significantly lower than for any of the commercial software out there.  But are there hidden costs that make it not as good a "buy" as one might think at first blush?  The short answer is "yes", just as with any open source software. Things like support costs, as well as the cost of finding and training staff, all add to the TCO of any open source solution, including OpenStack.

But let's talk about OpenStack itself a bit.  One of the things that I think was holding OpenStack back was the difficulty of deploying the solution.  However, this is rapidly being addressed by software such as Canonical's Juju. There are also a number of companies that provide OpenStack-based solutions, such as Piston OpenStack, which offers a turn-key OpenStack product.

The other way we can tell if a technology is ready for prime time is to look at existing adoption. A year or two ago, there were almost no enterprise implementations of OpenStack outside of a few service providers, such as Rackspace, and NASA.  This has changed: companies such as Bloomberg, Comcast, and Best Buy have all implemented OpenStack.

At the most recent OpenStack Summit, Bloomberg CTO Pravir Chandra, one of several company executives who detailed their real-world experience with the platform, said his team set a high bar for OpenStack. Bloomberg's goals included capabilities such as high availability, no cascading failures, and smooth scale-up and scale-down. As described in GigaOM:

"They were able to get there by deploying OpenStack along with considerable custom work of their own, both above and below that layer. They ended up setting up the high-availability databases and figuring out how to aggregate logs from the hypervisor level."

A story about Best Buy in ITWorld describes Bestbuy.com as "the poster child for organizations that can benefit from the cloud." The online retailer built an internal cloud on OpenStack that the company says speeds up the ecommerce site, allows faster development cycles, and scales.

For example, at the beginning of the Christmas shopping season last year, Bestbuy.com saw a spike of eight times its normal traffic, Joel Crabb, chief architect, told ITWorld. "If that doesn’t scream out for elastic scaling, I don’t know what does."

OpenStack also dramatically cut costs for Best Buy, company executives told summit attendees. Director of eBusiness Architecture Steve Eastham said that with past releases of the website, it cost about $20,000 to provision a single managed VM. With OpenStack, he said, the company is spending around $91,000 per rack.

So I think it's still an open question how OpenStack will ultimately stack up against Amazon Web Services in the public cloud infrastructure sector and VMware in the (mostly) private cloud market, where legacy applications are in play. But OpenStack evangelists like Rackspace CTO John Engates are gearing up to bring their solutions to enterprise customers. In an interview, he told Ryan Cox:

The enterprise community is thirsty for the cloud and that ball will soon drop. The opportunity to innovate in open source with OpenStack is one that the legacy solutions in enterprise will soon be eaten. Mobile devices, Big Data, your and my Internet of Things … access to all of these through infrastructure that can scale quickly at low cost is a common theme we’re hearing at the OpenStack Summit 2013.

So, back to our original question, is OpenStack "ready for prime time"?  I think that the answer is, maybe. If you're looking to build a private cloud infrastructure, I think it's a ready option. If you're looking for a hybrid solution, it's a bit less clear, but it's certainly possible.

Let me know what you think in the comments. I'm particularly curious whether you're involved in an OpenStack deployment.

Monday, June 3, 2013

Upgrading Your Storage Microcode

Folks,

I was just reading a posting by Chris Evans on this topic at http://architecting.it/2013/06/03/managing-microcode-upgrades/ and he makes a lot of great points.  I agree with everything Chris posted; I would only go further and say that, based on my experience, having a regular process for upgrading your storage microcode is critical to managing any storage environment.

There seem to be three competing philosophies that cause problems on this topic "in the wild":


  1. "If it Ain't Broke, Don't Fix It!" - This is the idea that you should only patch or upgrade your storage infrastructure if you run into a problem.  I run into this approach more often than you would think, and invariably what this means is that you will run into every problem that exists in the microcode and have to deal with it on an "emergency" basis. It also means that you will often go for long periods of time without patching or updating, and then when you hit a problem, you have a huge jump, which almost always means that you also have a lot of servers that need HBA firmware and/or driver updates.  This usually ends up being  aHUGE and painful project, that, in some people's minds simply confirms why they are avoiding doing the storage microcode upgrades in the first place.  What they don't realize is that the main reason it's so painful is that they are so far behind. If they actually kept up, then the pain would be less and spread over time.
  2. "Pick a standard, and keep it as long as possible" - This approach is one I see fairly often as well. Here the storage team picks a "standard" version of the OS, ans sticks to it only patching it when there is a problem, or until they are forced to change because new hardware doesn't support than version of the OS any longer. Then they adopt the new version of the OS as their standard, and bring everything up to that level. It's actually similar to #1, and suffers from the same sorts of issues.
  3. "Apply every patch and/or upgrade the vendor releases as soon as it becomes GA" - I see this much less frequently mainly because people are afraid, often rightfully so, that patching/upgrading this frequently will cause more problems than it solves.


The process that Chris outlines in his blog post is, in my opinion, the right way to go: apply your patches either quarterly or twice per year, in predefined upgrade windows.  This doesn't mean that you can't apply patches to resolve specific issues as they arise.

But I would go a bit further in my definition of the process.  Specifically, I would have a process that works something like this:


  1. Between upgrades (i.e. during the quarter or six-month period between upgrade windows), I would pull down every patch and upgrade that the storage manufacturer releases and apply it to a lab box in the storage team's lab.  I would then run a set of regression tests to validate that the patch/upgrade works in my environment, with my servers, HBAs, etc. (a rough sketch of what tracking this might look like appears after this list).
  2. About a week prior to my upgrade window, I would pull together an "upgrade" package in which I decide which patches/upgrades I am going to apply to the storage, as well as any that are required for the HBAs, host OSs, etc.  Note that it's critical that the host HBAs be upgraded to the latest drivers/firmware supported by the patches/upgrades you are going to roll out, to avoid issues. Server-side upgrades are avoided even more often than the storage OS upgrades, since they usually require an outage (a reboot) and, in many cases, it's not the storage team doing those upgrades.
  3. I would actually have two windows if possible: one for arrays that support dev/test, and one for arrays that support production. I would roll the patches out to dev/test first, let them bake there for a week or two, and then roll them out to production. This isn't 100% necessary, especially if you've done good testing in your lab, but it would be nice.
  4. Go to step #1 and start the process all over again.
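
To make step 1 a bit more concrete, here is a rough, hypothetical sketch of how a storage team might record lab qualification of patches between upgrade windows. Everything in it (the test commands, host names, file names, and patch identifiers) is invented for illustration; a real harness would call your array vendor's tools, host multipath checks, I/O load generators, and so on.

```python
# Hypothetical sketch: track which microcode patches pass lab regression
# testing, so qualified patches can go into the next upgrade package.
import subprocess
import datetime
import json

# Placeholder checks; real ones would exercise the array, HBAs, paths, I/O.
REGRESSION_TESTS = [
    ["ping", "-c", "3", "lab-array-01.example.com"],   # hypothetical lab array
]

def run_regression(patch_id):
    """Run the lab test suite against a freshly patched lab box and log the result."""
    results = []
    for cmd in REGRESSION_TESTS:
        ok = subprocess.call(cmd) == 0
        results.append({"test": " ".join(cmd), "passed": ok})
    record = {
        "patch": patch_id,
        "tested_on": datetime.date.today().isoformat(),
        "results": results,
        "qualified": all(r["passed"] for r in results),
    }
    # Qualified patches become candidates for the next quarterly/semi-annual window.
    with open("patch_qualification_log.json", "a") as log:
        log.write(json.dumps(record) + "\n")
    return record["qualified"]

if __name__ == "__main__":
    print(run_regression("ARRAY_FW_5.2.1"))   # hypothetical patch identifier
```
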
When I've described this process to people I often get pushback like, "Hey, that means we will constantly be either testing or performing upgrades!"  This is especially the case if you decide to go with the quarterly schedule. My response is, "Yup, because that's part of what a storage team does, and why you have a storage team." Frankly, the team's time is better spent on this than on, say, doing a lot of LUN allocations, which you can automate, and even delegate once automated.

The bottom line is, it's a "pay me now, or pay me later" situation, and I would rather do as much of my patching/upgrading as possible in a proactive manner than in a reactive manner where there's a big emergency and a big project with a lot of downtime all at once.