Monday, May 12, 2014

Modernizing Your Backups

This week, I'd like to spend a little time talking about backup modernization, or as I prefer to call it, data protection modernization. The process we use for traditional backups hasn't really changed much in 20 or 30 years. We do a full backup once per week, and take some kind of incremental backup of our data every day in between. These backups are always copied to some other storage medium, like tape or, these days, disk, and a retention is attached to each backup that defines how long we need to keep it. Those retentions are important, since they determine things like how much dedicated backup disk we need, how many tapes we need to have on hand, and so on. They also play an important role later on when/if we decide to change the way we do backups.
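That traditional cycle boils down to a calendar rule plus a retention date. As a quick sketch (the Sunday-full convention and the 4-week retention below are my own illustrative assumptions, not any product's defaults):

```python
from datetime import date, timedelta

# Assumed policy: weekly full on Sunday, incrementals the other six days.
FULL_DAY = 6  # Monday=0 ... Sunday=6
RETENTION = timedelta(days=28)  # e.g., keep each backup four weeks

def backup_type(run_date: date) -> str:
    """Return which kind of backup the traditional schedule calls for."""
    return "full" if run_date.weekday() == FULL_DAY else "incremental"

def expires_on(run_date: date) -> date:
    """A backup's retention defines when its media can be reclaimed."""
    return run_date + RETENTION

# Example: Sunday 2014-05-11 gets a full; Monday an incremental.
print(backup_type(date(2014, 5, 11)))  # full
print(backup_type(date(2014, 5, 12)))  # incremental
print(expires_on(date(2014, 5, 11)))   # 2014-06-08
```

Everything that follows (media counts, disk sizing, migration planning) hangs off those two simple rules.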

But first, let's talk about the fact that traditional backup processes are becoming more and more problematic. Why? There are actually a number of reasons. First, and perhaps most obvious, data sets are getting larger every day. This means that either the backups take longer and longer to complete, or more and more backup infrastructure needs to be put in place. Dedicated 10GbE connections, backup to disk, and more (and faster) tape drives all need to be deployed just to keep up. Yet it's a losing battle; the data sets just keep getting bigger. For example, a NAS array that holds a petabyte of data isn't terribly unusual today, as it was not all that long ago. These ever-larger data sets are now beginning to outstrip the ability of the storage system to send data to the backup system in a timely manner. Things like NDMP are simply unable to keep up with these very large data sets. So data set size is certainly one of the more pressing reasons that people are beginning to look into modernizing their backups.

Another reason that people are beginning to look at modernizing their backup processes is that backup windows are getting smaller and smaller, and in some cases, closing completely. Back in the day, we had all night to run backups. Yes, of course we had to dodge in between the batch jobs, but that was easy enough to do when you had 12 or more hours to do it. Those days are pretty much over. Today you are lucky to get any time at all to back up the data, and as I said above, in some cases, you really don't have a window at all.

Finally, Recovery Time Objectives (RTOs) are getting shorter and shorter, and Recovery Point Objectives (RPOs) are getting smaller and smaller. What this means for the backup administrator is that they must take backups more often, and must be able to restore from those backups more quickly.

So, what to do? The first step that many of my customers have taken is to start including snapshots as part of the backup process. This addresses the issue of RTOs and RPOs, since you can take those snapshots quickly, and you can recover from them quickly. You can also take multiple snapshots per day, so you have a much more fine-grained ability to recover the data to a particular point in time. However, most people continue to do their regular backups as well, based on the premise that snapshots aren't backups, since they don't make a full copy of the data to another storage medium. However, for some customers it's becoming so problematic to do those traditional fulls and incrementals that they are revisiting this position. Specifically, if they were to have a problem with their storage array such that they lost data, and couldn't recover from a snapshot, isn't that the definition of a disaster in the data center? If you accept that premise, then you can start to consider a combination of snapshots and, say, data replication for disaster recovery, as a viable, complete backup solution, and drop traditional backups entirely.
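The snapshot half of that approach is essentially a rotation policy: take frequent snapshots, prune the oldest once a retention count is exceeded. A minimal sketch of the logic, where the class and method names are hypothetical (real arrays expose this through their own vendor CLIs/APIs):

```python
from collections import deque

class SnapshotSchedule:
    """Toy model: keep at most `keep` snapshots, pruning oldest first."""
    def __init__(self, keep: int):
        self.keep = keep
        self.snaps = deque()  # snapshot names, oldest first

    def take_snapshot(self, name: str) -> list:
        """Record a new snapshot; return any pruned (expired) ones."""
        self.snaps.append(name)
        pruned = []
        while len(self.snaps) > self.keep:
            pruned.append(self.snaps.popleft())
        return pruned

# Four snapshots a day with keep=3 means the newest three survive.
sched = SnapshotSchedule(keep=3)
for n in ["00:00", "06:00", "12:00", "18:00"]:
    dropped = sched.take_snapshot(n)
print(list(sched.snaps))  # ['06:00', '12:00', '18:00']
print(dropped)            # ['00:00']
```

Raise `keep` and the snapshot frequency and you get exactly the fine-grained point-in-time recovery described above.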

A move to nothing but snapshots and replication as your data protection mechanism solves a number of issues. It addresses the ever-growing backup infrastructure, for example, by leveraging space you already have on your storage array, and a DR plan (replication) you may very well already have in place. Admittedly, for some longer retentions it might mean you need a bit more disk space in your array, but because of the nature of snapshots, it's probably the same or less space than you would need for disk-based backups. If you are already backing up to an external backup-to-disk array like a Data Domain, you can repurpose your DD budget and add the space you need to your storage array to hold all of the snapshots you need/want.
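A rough back-of-the-envelope comparison shows why the snapshot footprint tends to come out smaller. The figures below (volume size, daily change rate, retention) are assumptions for illustration, not measurements:

```python
# Hypothetical 100 TB volume, 2% daily change, 28-day retention.
volume_tb = 100
daily_change = 0.02
retention_days = 28

# Disk-based backups: 4 weekly fulls plus 24 daily incrementals on hand.
b2d_tb = 4 * volume_tb + 24 * volume_tb * daily_change

# Snapshots: each holds roughly one day's worth of changed blocks.
snap_tb = retention_days * volume_tb * daily_change

print(b2d_tb)   # 448.0 TB of backup-to-disk capacity
print(snap_tb)  # 56.0 TB of extra array capacity
```

Snapshots never store a second full copy, which is where most of the difference comes from; deduplicating backup targets narrow the gap, but the full-copy overhead is the baseline.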

Another method now beginning to become popular for modernizing your backups is to leverage changed block tracking. This is a mechanism in which the backup application, the storage array, or the hypervisor keeps track of the specific blocks that have changed, and the backup application only "backs up" those changed blocks. This can significantly reduce the amount of backup traffic from the storage array to the backup infrastructure, thus addressing the issue of ever-growing backup data sets. If you couple this with CDP (Continuous Data Protection) or near-CDP functionality, it will also address the RPO issues, and since recovery from this kind of backup often means sending less data back to the storage array/application, it can also address the RTO issues.
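The changed block tracking idea can be sketched in a few lines: a dirty-block set records writes between backups, and each backup ships only those blocks. This is a toy model of the mechanism, not how any particular hypervisor or array implements it:

```python
class ChangeTrackedVolume:
    """Toy volume that tracks which blocks changed since the last backup."""
    def __init__(self, nblocks: int, block_size: int = 4096):
        self.blocks = [bytes(block_size) for _ in range(nblocks)]
        self.block_size = block_size
        self.dirty = set()  # indices of blocks changed since last backup

    def write(self, index: int, data: bytes):
        self.blocks[index] = data.ljust(self.block_size, b"\0")
        self.dirty.add(index)  # CBT: remember what changed

    def backup_changed(self) -> dict:
        """Ship only the changed blocks, then reset the tracking map."""
        changed = {i: self.blocks[i] for i in sorted(self.dirty)}
        self.dirty.clear()
        return changed

vol = ChangeTrackedVolume(nblocks=1000)
vol.write(3, b"hello")
vol.write(42, b"world")
delta = vol.backup_changed()
print(sorted(delta))   # [3, 42] -- two blocks, not the whole volume
print(len(vol.dirty))  # 0 -- tracking resets after each backup
```

Two changed blocks out of a thousand means the backup traffic is a fraction of a percent of a full pass, which is the whole point of the technique.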

However, since you are probably already doing some kind of backup, most likely a traditional backup, the question becomes: how do I get from my current traditional backups to one of these more modern backup techniques? While on the surface it may seem simple enough, there are a number of issues to consider. First, you need to consider your existing backups. Those backups have a retention, and so you need to keep your existing backup software/mechanism in place, at least until the retentions on those existing backups have expired. One question that often crops up in this regard is: what if I have backups with very long retentions, like 7 years? Does this mean I need to keep my existing backup mechanism in place for 7 years? Well, that's certainly one way to handle the problem. One way to mitigate the issue a little, if you can, is to P2V your existing backup servers once you've switched all your backups to the new method. You can then shut down those VMs, and only spin them up if you need to get back at that old data for some reason. Another way to address the issue is to recognize that backups with long retentions are often not backups at all; they are actually archives, and they probably shouldn't have been backups in the first place. This is the perfect opportunity to start a dialog with your customers about the difference between backup and archive, and to get an archive mechanism in place to handle that data. The difference between archive and backup is a topic near and dear to my heart, but it's also beyond the scope of this posting. Just keep this in mind when you go to do your backup modernization planning.
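On the legacy-retention question, the date you can finally retire (or stop keeping the P2V'd VMs of) the old backup environment is simply the latest expiry across the backups it still holds. A sketch, with made-up retention figures:

```python
from datetime import date, timedelta

# Hypothetical final backups taken under the old scheme, with retentions.
legacy_backups = [
    (date(2014, 5, 1), timedelta(days=28)),       # weekly set, 4-week keep
    (date(2014, 4, 15), timedelta(days=365)),     # yearly, 1-year keep
    (date(2014, 1, 1), timedelta(days=7 * 365)),  # the 7-year problem
]

def decommission_date(backups):
    """The old mechanism must remain restorable until the last
    retention expires -- the max over (taken + retention)."""
    return max(taken + keep for taken, keep in backups)

print(decommission_date(legacy_backups))  # 2020-12-30
```

One long-retention entry dominates the whole calculation, which is exactly why reclassifying that data as archive, with its own mechanism, shortens the legacy tail so dramatically.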

The other issue that you should consider when planning to modernize your backups is management. Much of the utility of today's backup software, such as CommVault, NetBackup, and TSM, is around managing the backups: scheduling them, monitoring that they complete successfully, and reporting on them, both from an administrative perspective and up the management tree and to your customers, so that everyone is assured that their data is protected. Many people think that moving to a new, more modern backup process means getting rid of these tried and true software programs. However, there may be an advantage to keeping them in place. For example, that reporting mechanism that is so important to your business then also stays in place. Considering that many snapshots, for example, are managed by software provided by the array manufacturer, and often only manage the snapshots on one array at a time, you could end up in a situation where your backups are modernized, but your backup management has taken a step back in time. This is also true if you bring on several different techniques to back up your data. For example, I know of customers who use snapshots and replication for their databases, and then use something like Veeam to back up their virtual infrastructure. This has the potential to create an even bigger management/administrative/reporting headache.

So, if you can leverage your current backup software to manage your snapshots, and/or perform CDP-like functions via changed block tracking, then I believe you've hit on the best of both worlds. The good news is that most of the backup software vendors have recognized this, and are moving aggressively to add these kinds of features to their products. Admittedly, some are further ahead in some areas than others, but it's not like you have to change overnight, so implementing the features as they appear in your backup software isn't necessarily a bad thing.
