So, what’s the best way to back up VMs?

Backing up VMs is one of those topics nobody seems able to agree on, despite a plethora of reading material on the subject… and maybe because of said plethora.

I will focus on VMware since it is the prevalent virtualization platform in the marketplace today (I’m sure the KVM, Xen and Hyper-V fanboys will have their 15 minutes of fame someday).

VMware offers several ways to back up VMs:

  1. Install a backup agent in the VM, just as with a normal client
  2. Back up the entire VM by installing a backup agent in the ESX console
  3. Use VCB (VMware Consolidated Backup).

     

They all have their pros and cons, so the short answer is that there’s no best method. Instead, you’ll get the “it depends” answer. Sorry. Here’s the skinny on each method:

 

1. Install a backup agent in the VM, just as with a normal client

 

Pros:

  • Everyone understands this, since it works just like a real physical client and can do most of the same things
  • Can do incrementals
  • File-level recovery is straightforward with no confusion as to which VM owns which file
  • Advanced backup features such as DB agents work fine

 

Cons:

  • Impact on the host and network
  • Deployment is just as difficult as with physical clients
  • Can make backup software licensing more expensive than needed
  • Bare-metal recovery of VMs is only a bit less difficult than with physical boxes

 

2. Back up the entire VM by installing a backup agent in the ESX console

 

Pros:

  • Licensing cost for backup software minimized (1 license needed per ESX server)
  • The entire VM is backed up so recovery is like Bare Metal Recovery – you’ll get the entire box back with a very high probability of success
  • Fast since the virtualization layers are bypassed

 

Cons:

  • Still significant impact on the host and network
  • Cannot restore individual files
  • Advanced backup agents won’t work (no hot backups of SQL or Exchange, for instance)
  • Backups always large since a full backup is required every time
  • Backups take long (see previous point)
  • Requires some scripting knowledge to deploy properly.
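
To put the "full backup every time" point in numbers, here is a rough sketch comparing a week of nightly full images against one full plus daily incrementals (which methods 1 and 3 allow). The 40 GB VM size and the 5% daily change rate are made-up assumptions for illustration, not measurements.

```python
def weekly_backup_gb(vm_size_gb, daily_change_rate, days=7):
    """Return (nightly_fulls, full_plus_incrementals) totals in GB."""
    nightly_fulls = vm_size_gb * days
    # One full on the first day, then only the changed data on the other days.
    full_plus_incr = vm_size_gb + vm_size_gb * daily_change_rate * (days - 1)
    return nightly_fulls, full_plus_incr


# Hypothetical 40 GB VM with 5% of its data changing per day.
fulls, incrementals = weekly_backup_gb(40, 0.05)
print(f"Nightly fulls: {fulls:.0f} GB, full + incrementals: {incrementals:.0f} GB")
```

The gap widens with every extra VM on the host, which is why the "always a full" limitation hurts both backup size and backup time.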

 

3. Use VCB (VMware Consolidated Backup).

 

Pros:

  • Works with most backup software
  • Almost no impact on the host or network (backups can be entirely SAN-based)
  • Reduced backup software licensing cost
  • Works with VSS in Windows to provide better backup reliability
  • Allows for incremental backups
  • Uses VM snapshots
  • No disk space used for staging of incrementals
  • Very simple DR
  • File-level backups are possible

 

Cons:

  • Cannot back up RDMs in Physical Compatibility Mode
  • Advanced functionality (file-level backups and application integration) won’t work with non-Windows VMs
  • Cannot back up clustered VMs (i.e. MSCS-clustered VMs can’t be backed up)
  • FullVM backup speed is limited to 1GB/min (a limitation of Windows’ cmd.exe; I guess you can get around it by creating multiple threads, but you could still have speed issues if the jobs are large and you cannot break them up)
  • Significant disk space needed for Holding Tank (where FullVM copies are placed)
  • Advanced backup agents will not work
  • File-level backups won’t back up the Windows registry
  • File-level recovery is complex and generally a two-step process
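
Taking the 1GB/min FullVM figure above at face value, a quick estimate shows why parallel jobs only help when the work can be split. This is a rough sketch: the VM sizes are hypothetical, and it assumes each stream runs at the full 1GB/min and that a single VM cannot be spread across streams.

```python
def fullvm_backup_minutes(vm_sizes_gb, gb_per_min=1.0, streams=1):
    """Estimate wall-clock minutes to back up the given VMs.

    VMs are assigned largest-first to the least loaded stream, which is
    a simple greedy heuristic rather than an optimal schedule.
    """
    loads = [0.0] * streams
    for size in sorted(vm_sizes_gb, reverse=True):
        idx = loads.index(min(loads))  # least loaded stream so far
        loads[idx] += size
    # The slowest stream determines when the whole job finishes.
    return max(loads) / gb_per_min


# Hypothetical host: one large 200 GB VM plus four 20 GB VMs.
vms = [200, 20, 20, 20, 20]
print(fullvm_backup_minutes(vms, streams=1))  # single stream
print(fullvm_backup_minutes(vms, streams=4))  # four parallel streams
```

With one stream the estimate is 280 minutes. With four it only drops to 200, because the single 200 GB VM still has to go through one stream on its own.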

 

The lists could go on but as you can see there are serious wrinkles with all the approaches.

The problem is compounded by the fact that most modern backup software has arcane licensing schemes. Some depend on whether an agent is on a VM or not (CommVault), some allow unlimited agents per ESX server as long as you buy the more expensive client license for that server (NetBackup), and there are various permutations thereof.

Another wrinkle is Deduplication. Products that do source-based Deduplication such as EMC’s Avamar can comfortably have their agents inside the VMs or in the service console since subsequent backups take only a fraction of the time and there’s almost no space penalty. So, with Avamar one could do both kinds of backup (entire VM and individual files), be covered both ways, and only worry about time and space when reading Hawking’s books… The negative is cost.
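
For anyone wondering why source-based deduplication makes subsequent backups so cheap, here is a toy sketch of the general idea: split the data into chunks, hash each one, and only send chunks the store has not seen before. The fixed 4 KB chunk size, the in-memory dictionary and the `backup` function are all assumptions made for illustration. This is not how Avamar or any other product actually implements it.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, purely for illustration


def backup(data: bytes, store: dict) -> int:
    """Add unseen chunks of data to store; return the bytes actually sent."""
    sent = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:  # only new chunks leave the client
            store[digest] = chunk
            sent += len(chunk)
    return sent


store = {}
# Hypothetical 100-chunk "disk" where every chunk is distinct.
disk = b"".join(i.to_bytes(4, "big") * (CHUNK_SIZE // 4) for i in range(100))
first = backup(disk, store)

# Overwrite one chunk, then back up the whole disk again.
changed = disk[:CHUNK_SIZE * 10] + b"\xff" * CHUNK_SIZE + disk[CHUNK_SIZE * 11:]
second = backup(changed, store)
print(first, second)
```

The first run sends all 409600 bytes. The second sends only the one changed 4096-byte chunk, even though it was asked to back up the whole disk again.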

NetBackup offers another interesting twist since their implementation of VCB allows individual files to be recovered from a FullVM backup – the rationale being that you use their PureDisk Deduplication to store everything in order to reduce the expense of backup disk.

In the end, the only recommendation I can give that doesn’t depend too much on your individual circumstances is to try and do both file-level and FullVM-type backup so that you’re covered in multiple ways. Then replicate those backups, etc… you know the drill by now.

D

 

 

2 Replies to “So, what’s the best way to back up VMs?”

  1. This is a nice summary of the pros and cons. There’s one other option (of many) that I think is worth mentioning: image-based backups like Acronis or even Ghost.

    This type of backup could fall into the agent-in-the-VM category, but it is typically faster (how do they do it so FAST?!?!) than the usual B2D or B2T agents, and it allows both whole-disk restoration and individual file restoration. There are still problems with this method as well, especially cost.

    I’m curious as to your opinion on these types of VM backups.

  2. The solution we have deployed, esxpress, seems to overcome several of the issues you mentioned for console-based backups:

    * Still significant impact on the host and network
    -> this remains true, but each backup runs as a separate VM, so you don’t bog down the console’s CPU. As such, you can use all physical CPUs in the machine if you wish to.

    * Cannot restore individual files
    -> it can with an additional license, but we have not had the need for this.

    * Advanced backup agents won’t work (no hot backups of SQL or Exchange, for instance)
    -> haven’t checked this out yet

    * Backups always large since a full backup is required every time
    -> it can do delta backups, so this is mitigated as well

    * Backups take long (see previous point)
    -> currently doing 70 backups (average size of 10GB/VM) in less than an hour

    * Requires some scripting knowledge to deploy properly.
    -> a bit too open-ended to really go into.
