Monday, April 27, 2015

The Data Center of the Future: what does it look like?

Folks,

I've been spending a lot of time lately talking with customers about storage, flash, HDDs, hyper-converged infrastructure, cloud, and so on. What's become clear to me recently (yes, I'm a little slow) is that all of these technology changes are driving us toward sea changes in the enterprise data center. In this blog posting, I want to talk a little about how things are changing with regard to storage. I'll talk a bit about flash vs. HDD technology and where I see each of them going in the next few years, and I'll finish up with a discussion of how that will affect the enterprise data center going forward, as well as the data center infrastructure industry in general.

I believe that the competition between flash and hard disk-based storage systems will continue to drive developments in both. Flash has the upper hand in performance and benefits from Moore's Law improvements in cost per bit, but has increasing limitations in lifecycle (write endurance) and reliability. Finding well-engineered solutions to these issues will define its progress. Hard disk storage, on the other hand, has cost and capacity on its side. Maintaining those advantages is the primary driver of its roadmap, but I see limits to how far that will take it.

Hard Disk Drives (HDDs)
So, let's start with a discussion of HDDs. Hard disk developments continue to wring out a mixture of increased capacity and stable or improved performance at lower cost. For example, Seagate introduced a 6TB disk in early 2014 that finessed existing techniques, then announced an 8TB disk at the end of 2014 based on Shingled Magnetic Recording (SMR). SMR works by allowing tracks on the disk to overlap each other, eliminating the fallow area previously used to separate them. The greater density this allows is offset by the need to rewrite multiple tracks at once. This slows down some write operations, but in exchange you get roughly a 25 percent increase in capacity with little need for expensive revamps in manufacturing techniques.
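
To make that trade-off concrete, here is a rough sketch in Python. The zone and track sizes are invented for illustration and don't reflect any real drive's geometry; the point is simply that an update near the start of a shingled zone forces a much larger rewrite than one near the end.

```python
# Rough, illustrative model of the SMR write penalty. The zone and
# track sizes below are made up; they are not real drive geometry.

ZONE_TRACKS = 8          # shingled tracks per zone (hypothetical)
BLOCKS_PER_TRACK = 100   # logical blocks per track (hypothetical)


def rewrite_cost(first_track_touched: int) -> int:
    """Blocks that must be rewritten to update data on a given track.

    Because shingled tracks overlap like roof shingles, rewriting
    track N disturbs the tracks layered on top of it, so the drive
    has to read and rewrite everything from that track to the end
    of the zone.
    """
    tracks_to_rewrite = ZONE_TRACKS - first_track_touched
    return tracks_to_rewrite * BLOCKS_PER_TRACK


if __name__ == "__main__":
    # Updating the first track forces the whole zone to be rewritten;
    # updating the last track touches only itself.
    for track in (0, ZONE_TRACKS // 2, ZONE_TRACKS - 1):
        print(f"update on track {track}: rewrite {rewrite_cost(track)} blocks")
```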

If SMR is commercially successful, then it will speed the adoption of another technique, Two-Dimensional Magnetic Recording (TDMR) signal processing. This becomes necessary when tracks are so thin and/or close together that the read head picks up noise and signals from adjacent tracks when trying to retrieve the wanted data. A number of techniques can solve this, including multiple heads that read portions of multiple tracks simultaneously to let the drive mathematically subtract inter-track interference signals.
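
Here is a toy illustration of that subtraction idea. The bit patterns and mixing coefficients are invented, and real TDMR signal processing is far more sophisticated, but it shows how two reads that pick up different blends of adjacent tracks can be combined to cancel the neighbor's contribution.

```python
# Toy illustration of cancelling inter-track interference. The data
# and mixing coefficients are invented; real TDMR signal processing
# is far more sophisticated than a 2x2 linear solve.

import numpy as np

rng = np.random.default_rng(0)
wanted = rng.choice([-1.0, 1.0], size=16)    # bits on the target track
neighbor = rng.choice([-1.0, 1.0], size=16)  # bits on the adjacent track

# Two head positions pick up different blends of the two tracks.
read_a = 0.9 * wanted + 0.3 * neighbor
read_b = 0.3 * wanted + 0.9 * neighbor

# If the drive knows (or can estimate) the blend, it can solve the
# 2x2 system and subtract the neighbor's contribution.
mix = np.array([[0.9, 0.3],
                [0.3, 0.9]])
recovered_wanted, recovered_neighbor = np.linalg.solve(
    mix, np.vstack([read_a, read_b]))

assert np.allclose(recovered_wanted, wanted)
print("recovered target track:", recovered_wanted)
```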

A third major improvement in hard disk density is Heat-Assisted Magnetic Recording (HAMR). This uses drives with lasers strapped to their heads, heating up the track just before the data is recorded. This produces smaller, better-defined magnetized areas with less mutual interference. Seagate had promised HAMR drives this year, but now says that 2017 is more likely.

Meanwhile, Hitachi has improved capacity in its top-end drives by filling them with helium. The gas has a much lower viscosity than air, so platters can be packed closer together. This allows for greater density at the drive level.

All these techniques are being adopted as previous innovations -- perpendicular rather than longitudinal recording, for example, where bits are stacked up like biscuits in a packet instead of lying flat on a plate -- are running out of steam. By combining all of the above ideas, the hard disk industry expects to squeeze out another three or four years of continuous capacity growth while maintaining its price advantage over flash. However, it should be noted that all of the innovation in HDDs is around capacity. I believe that HDDs will continue to dominate the large-capacity, archive type of workloads for the next 2 or 3 years. After that ... well, read the next section on flash.

Some argue that the cloud will take over this space. However, even if that is true, cloud providers will continue to need very cheap, high-capacity HDDs until flash is able to take over the high-capacity space as well on a $/GB basis.

Flash
Flash memory is changing rapidly, with many innovations moving from small-scale deployment into the mainstream. Companies such as Intel and Samsung are predicting major advances in 3D NAND, where the basic one-transistor-per-cell architecture of flash memory is stacked into three dimensional arrays within a chip.

Intel, in conjunction with its partner Micron, is predicting 48GB per die this year by combining 32-layer 3D NAND with multi-level cells (MLC) that double the storage per transistor. The company says this will create 1TB SSDs that fit in mobile form factors and are much more competitive with consumer hard disk drives -- which are still around five times cheaper at that capacity -- as well as 10TB enterprise-class SSDs by 2018. Moore's Law will continue to drive down the cost per TB of flash at the same time as these capacity increases occur, making flash a viable replacement for high-capacity HDDs in the next 3 to 5 years. Note that this assumes SSDs will leverage technology such as deduplication in order to help reduce the footprint of the data and drive down cost.
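
As a minimal sketch of the deduplication idea (illustrative only; this is not how any particular SSD controller or array implements it), a content-addressed store keeps one physical copy of each unique chunk and records duplicates as references:

```python
# Minimal sketch of block-level deduplication: store each unique
# chunk once, keyed by its hash, and keep only references for
# duplicates. Purely illustrative; real dedup engines are far more
# involved (variable chunking, metadata overhead, garbage collection).

import hashlib
import os


class DedupStore:
    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # hash -> unique chunk data
        self.volume = []   # logical layout, as a list of chunk hashes

    def write(self, data: bytes) -> None:
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(key, chunk)  # stored once per unique chunk
            self.volume.append(key)

    def logical_bytes(self) -> int:
        return len(self.volume) * self.chunk_size

    def physical_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


if __name__ == "__main__":
    store = DedupStore()
    pattern = os.urandom(4 * 1024 * 1024)  # 4 MB of random data
    for _ in range(10):                    # write ten identical copies
        store.write(pattern)
    print("logical MB: ", store.logical_bytes() / 2**20)   # ~40 MB
    print("physical MB:", store.physical_bytes() / 2**20)  # ~4 MB
```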

The following is a chart from a Wikibon article on the future of flash:

[Chart: Wikibon projection of the four-year cost per TB of flash vs. HDD storage]

As you can see from the chart, by 2017 the 4-year cost per TB of flash will be well below that of HDDs, and the trend continues until 2020, when the 4-year cost per TB of flash hits $9 versus $74 for HDDs. You can read the entire article here.
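
Using only the 2020 figures quoted above, a quick back-of-the-envelope comparison shows how large the projected gap is. The 1PB fleet size below is a made-up example, not a number from the article.

```python
# Back-of-the-envelope comparison using only the 2020 figures quoted
# above ($9/TB four-year cost for flash vs. $74/TB for HDD). The 1 PB
# fleet size is a made-up example, not a number from the article.

FLASH_4YR_COST_PER_TB = 9    # USD, Wikibon 2020 projection cited above
HDD_4YR_COST_PER_TB = 74     # USD, Wikibon 2020 projection cited above

capacity_tb = 1000  # hypothetical 1 PB of usable capacity

flash_cost = capacity_tb * FLASH_4YR_COST_PER_TB
hdd_cost = capacity_tb * HDD_4YR_COST_PER_TB

print(f"flash: ${flash_cost:,}   hdd: ${hdd_cost:,}   "
      f"hdd/flash ratio: {hdd_cost / flash_cost:.1f}x")
```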

Conclusions
So, what does all this mean?  Among other things, it means that you can expect a shift to what the Wikibon article calls the "Electronic Data Center".  The Electronic Data Center is simply a data center where the mechanical disk drive has been replaced by something like flash, thus eliminating the last of the mechanical devices (they assume tape and tape robots are already gone in their scenario).  This will reduce the electricity and cooling needs, as well as the size/footprint of the data center of the future.

Let's assume for a moment that Wikibon is correct.  What does this mean to the data center infrastructure industry?

  1. Companies that build traditional storage arrays will need to shift their technology to "all flash", and they need to do it quickly. You can already see this happening at companies such as EMC, which acquired XtremIO to obtain all-flash technology. Companies like NetApp, on the other hand, are developing their all-flash solutions in house. In both cases, however, the all-flash solutions face internal battles against engineering organizations that are vested in the status quo. That means they could be slow to market with potentially inferior products. However, their sheer size in the market may protect them from complete failure.
  2. What about the raft of new startups producing all-flash arrays? Might the above provide an opening for one or more of those startups to "go big" in the market? What about the rest? My take is that one or more of them might indeed have the opportunity to "go big", thanks to the gap created by the "big boys" moving too slowly or trying to shoehorn old, existing technologies into the data center. Most of them, however, will either die off or be acquired by a larger competitor.
However, I think there is an even larger risk to the "storage only" companies, both new and old. I believe that a couple of other market forces will put significant pressure on these "storage only" companies, including the new all-flash startups.

Specifically, the trends toward cloud computing and hyper-converged infrastructure, along with the growing emphasis on automation driven by other IT trends such as DevOps, will make standalone storage arrays less and less desirable to IT organizations. This will force those companies to move beyond their roots into areas such as hyper-converged infrastructure, where they currently have little or no engineering expertise or management experience.

The companies that are able to embrace these kinds of moves will likely have a bright future in the data center of the future. However, issues around "not invented here" and a lack of engineering talent in the new areas of technology are going to be a challenge for the very large storage companies going forward. Again, how they address these issues will be a determining factor in their future success.

To wrap it up, I firmly believe that not everything is "moving to the public cloud" in the enterprise space. What I do believe is that:

  1. Some workloads currently running in the enterprise data center will move to the public cloud, and be managed by IT.
  2. Some workloads will remain in "private" clouds owned and operated by IT. However, those private clouds must offer internal customers all of the same ease of use that the public cloud offers. Most likely, they will leverage web-scale architectures (hyper-converged) in order to make management and management automation easier.
  3. Hybrid cloud management software will be used to allow both management and automation to span the enterprise's private cloud and its public cloud(s).
  4. DevOps and similar initiatives will drive significant automation into the hybrid clouds I describe above, as well as significant change to IT organizations.
  5. These changes will all be highly disruptive, and those IT organizations that embrace change will have an easier time over the next few years than those that don't. Very large IT organizations will have the hardest time making the changes. Yes, it is hard to turn the aircraft carrier. However, internal customers are demanding it of IT, and will go outside the IT organization to get what they want/need if necessary.
In the end, the Data Center of the Future will look very different from the current enterprise data center. It will be a hybrid cloud that spans on-premises and public clouds. It will be an all-electronic data center that uses a significantly smaller footprint and significantly less electricity than current data centers. And finally, this infrastructure will leverage significant automation and be managed by an IT organization that looks very different from the IT organization of today.


Wednesday, April 22, 2015

Structured or Unstructured PaaS??

Words, labels, tags, etc. in our industry mean something – at least for a while – and then marketing organizations tend to get involved and use words and labels and tags to best align to their specific agenda. For example, things like “webscale” or “cloud native apps” were strongly associated with the largest web companies (Google, Amazon, Twitter, Facebook, etc.). But over time, those terms got usurped by other groups in an effort to link their technologies to hot trends in the broader markets.

Another one that seems to be shifting is PaaS, or Platform as a Service. It’s sort of a funny acronym to say out loud, and people are starting to wonder about its future. But we’re not an industry that likes to stand still, so let’s move things around a little bit. Maybe PaaS is the wrong term, and it really should just be “Platform”, since everything in IT should eventually be consumed as a service. I’m already hearing about XaaS (X as a Service), which pretty much means anything as a service, or perhaps everything as a service.

But not everyone believes that a Platform (or PaaS) should be an entirely structured model. There is lots of VC money being pumped into less structured models for delivering a platform, such as Mesosphere, CoreOS, Docker, HashiCorp, Kismatic, Cloud66, and the Apache Brooklyn project, as well as into moves like Engine Yard’s acquisition of OpDemand.

I’m not sure if “Structured PaaS” and “Unstructured PaaS” are really the right terms for this divergence in thinking about how to deliver a Platform, but they work for me. The unstructured approach seems to appeal more to DIY-focused start-ups, while Structured PaaS (e.g. Cloud Foundry, OpenShift) seems to appeal more to Enterprise markets that expect a lot more “structure” in terms of built-in governance, monitoring/logging, and infrastructure services (e.g. load-balancing, high availability, etc.). The unstructured approach can be assembled in a variety of configurations, aka “batteries included but removable”, whereas the structured model incorporates more out-of-the-box elements in a more tightly integrated configuration.

Given the inherent application portability that comes with either a container-centric or a PaaS-centric model, both of these are areas that IT professionals and developers should be taking a close look at, especially if they believe in a Hybrid Cloud model – whether that’s Private/Public or Public/Public. It’s also an area that will drive quite a bit of change in the associated operational tools, which are beginning to overlap with the native platform tools for deployment and configuration management (e.g. CF BOSH, Dockerfiles, or Vagrant).

It’s difficult to tell at this point which approach will gain greater market share. The traditional money would tend to follow the more structured approach, which aligns with Enterprise buying centers. But the unstructured IaaS approach taken by AWS has given it a significant market-share lead among developers. Will that unstructured history be any indication of how the Platform market plays out? Or will too many of those companies struggle to find viable financial models after taking all that VC capital, and eventually end up as just a feature within a broader structured platform? I want to hear what you think; all respectful comments are welcome.