Backing up VMs seems to be one of those topics nobody can agree on despite a plethora of reading material on the subject – and maybe because of said plethora.
I will focus on VMware since it is the prevalent virtualization platform in the marketplace today (I’m sure the KVM, Xen and Hyper-V fanboys will have their 15 minutes of fame someday).
VMware offers several ways to back up VMs:
- Install a backup agent in the VM, just as with a normal client
- Back up the entire VM by installing a backup agent in the ESX console
- Use VCB (VMware Consolidated Backup).
They all have their pros and cons, so the short answer to the topic is that there’s no best method. Instead, you’ll get the “it depends” answer. Sorry. Here’s the skinny on each method:
1. Install a backup agent in the VM, just as with a normal client
- Everyone understands this, since it works just like a real physical client and can do most of the same things
- Can do incrementals
- File-level recovery is straightforward with no confusion as to which VM owns which file
- Advanced backup features such as DB agents work fine
- Impact on the host and network
- Deployment just as difficult as when using the physical clients
- Can make backup software licensing more expensive than needed
- Bare-metal-recovery of VMs only a bit less difficult than with physical boxes
2. Back up the entire VM by installing a backup agent in the ESX console
- Licensing cost for backup software minimized (1 license needed per ESX server)
- The entire VM is backed up so recovery is like Bare Metal Recovery – you’ll get the entire box back with a very high probability of success
- Fast since the virtualization layers are bypassed
- Still significant impact on the host and network
- Cannot restore individual files
- Advanced backup agents won’t work (no hot backups of SQL or Exchange, for instance)
- Backups always large since a full backup is required every time
- Backups take a long time (see previous point)
- Requires some scripting knowledge to deploy properly.
3. Use VCB (VMware Consolidated Backup).
- Works with most backup software
- Almost no impact on the host or network (backups can be entirely SAN-based)
- Reduced backup software licensing cost
- Works with VSS in Windows to provide better backup reliability
- Allows for incremental backups
- Uses VM snapshots
- No disk space used for staging of incrementals
- Very simple DR
- File-level backups are possible
- Cannot back up RDMs in Physical Compatibility Mode
- Advanced functionality (file-level backups and application integration) won’t work with non-Windows VMs
- Cannot back up clustered VMs (e.g. MSCS-clustered VMs)
- FullVM backup speed is limited to 1GB/min (a limitation of Windows’ cmd.exe; you can get around it by creating multiple threads, I guess, but you could still have speed issues if the jobs are large and you cannot break them up)
- Significant disk space needed for Holding Tank (where FullVM copies are placed)
- Advanced backup agents will not work
- File-level backups won’t back up the Windows registry
- File-level recovery is complex and generally a two-step process
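To put a number on the 1GB/min FullVM limit from the VCB list, here is a rough back-of-the-envelope sketch. It assumes the limit applies per backup stream (which is my guess above, not a documented fact), that each VM is one job that cannot be split, and it ignores SAN and proxy bottlenecks entirely. The function name and the VM sizes are made up for illustration.

```python
# Per-stream FullVM rate quoted in the list above, in GB per minute.
# Treating it as a per-stream limit is an assumption.
VCB_FULLVM_GB_PER_MIN = 1.0


def fullvm_backup_minutes(vm_sizes_gb, streams=1,
                          rate_gb_per_min=VCB_FULLVM_GB_PER_MIN):
    """Estimate wall-clock minutes for a set of FullVM backups.

    Each VM is one indivisible job. Jobs are handed out largest first,
    always to the stream with the least work queued so far.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    loads = [0.0] * streams  # minutes of work queued on each stream
    for size in sorted(vm_sizes_gb, reverse=True):
        idx = loads.index(min(loads))
        loads[idx] += size / rate_gb_per_min
    # The backup window ends when the busiest stream finishes.
    return max(loads)


# Hypothetical VM sizes in GB: one big file server and three small boxes.
sizes = [400, 60, 40, 20]
for n in (1, 2, 4):
    print(n, "stream(s):", fullvm_backup_minutes(sizes, streams=n), "minutes")
```

With these made-up sizes a single stream needs 520 minutes, and going to two or even four streams only gets you down to 400, because the one 400GB VM cannot be broken up. That is the speed issue mentioned in the list: extra threads help only as far as your largest single job allows.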
The lists could go on but as you can see there are serious wrinkles with all the approaches.
The problem is compounded by the arcane licensing schemes of most modern backup software. CommVault, for instance, licenses differently depending on whether an agent is on a VM or not, while NetBackup allows unlimited agents per ESX server as long as you buy the more expensive client license for that ESX server. Various permutations of these schemes exist.
Another wrinkle is deduplication. Products that do source-based deduplication, such as EMC’s Avamar, can comfortably have their agents inside the VMs or in the service console, since subsequent backups take only a fraction of the time and there’s almost no space penalty. So with Avamar one could do both kinds of backup (entire VM and individual files), be covered both ways, and worry about time and space only when reading Hawking’s books… The negative is cost.
NetBackup offers another interesting twist since their implementation of VCB allows individual files to be recovered from a FullVM backup – the rationale being that you use their PureDisk Deduplication to store everything in order to reduce the expense of backup disk.
In the end, the only recommendation I can give that doesn’t depend too much on your individual circumstances is to try and do both file-level and FullVM-type backup so that you’re covered in multiple ways. Then replicate those backups, etc – you know the drill by now.