A word of caution when setting up a deduplicating VTL

Based on some recent experiences, I want to make people aware of some caveats when setting up a VTL with deduplication. This is specifically about the EMC DL3D (AKA Quantum DXi), but it applies to the other deduplicating VTLs as well. This will be a mercifully short and to-the-point post. Here’s the rub:

  • Create small virtual tapes (100GB max; I’d go even smaller, but it obviously depends on your environment)
  • Create a bunch of virtual tape drives (you might have to create 20-30!)
  • Do NOT, I repeat, NOT multiplex in the backup software! It screws up the deduplication algorithm.
  • Do not compress the data before the backup
  • Do not encrypt the data
  • Be mindful of your retention policies: start gently, then work your way up.
  • I’d personally not multistream a server at all, just so I can keep the tape utilization high. What I mean: say you do not multiplex but you do multistream, i.e. you’re sending 10 streams from your client. Without multiplexing, that means you need 10 tapes, so you end up writing a tiny bit on each tape. It doesn’t take a genius to realize that you’ll end up with a ton of tapes with not much data on them. Those tapes then keep getting appended to with more tiny amounts of data, which in turn makes them expire way later than you’d like.
  • If you can use the box as NAS and know how to get the throughput up, then do so; that way there’s no issue with multiple streams. My Data Domain boys are chuckling now (they always prefer to do NAS, but that also has to do with the fact that their box can’t really do VTL properly yet). Oh, the cattiness! BTW, my company does sell quite a lot of their stuff.
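
To put some rough numbers on the multistreaming point above, here is a minimal Python sketch. It is a deliberately simplified model, not anything the backup software actually computes: it assumes no multiplexing (one tape per stream), equal-sized streams, and that each stream keeps appending to the same tape every night. The function names and the example figures (50GB nightly backup, 100GB tapes, 30-day retention) are made up for illustration.

```python
import math


def fill_fraction_per_night(nightly_gb: float, streams: int, tape_gb: float) -> float:
    """Fraction of one tape filled per night when the backup is split
    evenly across `streams` tapes (no multiplexing: one tape per stream)."""
    per_tape_gb = nightly_gb / streams
    return per_tape_gb / tape_gb


def nights_to_fill(nightly_gb: float, streams: int, tape_gb: float) -> int:
    """Nights of appending before each tape is full (simplified model)."""
    per_tape_gb = nightly_gb / streams
    return math.ceil(tape_gb / per_tape_gb)


def days_until_recyclable(nightly_gb: float, streams: int, tape_gb: float,
                          retention_days: int) -> int:
    """A tape can only be recycled once the LAST image written to it expires.
    First write is day 0, last write is day (nights_to_fill - 1)."""
    last_write_day = nights_to_fill(nightly_gb, streams, tape_gb) - 1
    return last_write_day + retention_days


# Hypothetical client: 50GB per night, 100GB virtual tapes, 30-day retention.
for streams in (1, 10):
    print(
        f"{streams:>2} stream(s): "
        f"{fill_fraction_per_night(50, streams, 100):.0%} of a tape per night, "
        f"full after {nights_to_fill(50, streams, 100)} nights, "
        f"recyclable after {days_until_recyclable(50, streams, 100, 30)} days"
    )
```

In this toy model, a single stream fills a tape in 2 nights, while 10 streams leave 10 tapes each sitting at 5% after the first night and hanging around for weeks longer before they can be recycled. Real environments will differ, but the direction of the effect is the point.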

Otherwise, the same rules apply as in my previous post about tuning NetBackup for large environments.

Regarding using the DL3D/DXi as NAS: Plug in as many GigE ports as you can, but make sure your switch can do straight-up EtherChannel (not LACP). So you pretty much need to have a “proper” Cisco switch in order to get the full benefit. Then use multiple media servers. Use a separate NAS share per media server. Team the NICs on the backup servers for performance (do LACP or PAgP there, whatever works with the server’s NIC software). Then call me in the morning.
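
As a back-of-the-envelope check on why both the port count and the number of media servers matter, here is a small Python sketch. It rests on assumptions that are mine, not measurements from the box: each GigE link tops out at the theoretical 125 MB/s (1 Gbit/s divided by 8, before protocol overhead), each media server's traffic lands on a single link of the channel, and the switch spreads the media servers perfectly across the links. The function name is hypothetical.

```python
GIGE_MB_PER_S = 125.0  # theoretical line rate of one GigE link (1 Gbit/s / 8)


def aggregate_ceiling_mb_s(ports: int, media_servers: int) -> float:
    """Best-case aggregate throughput into the appliance.

    Assumes one flow per media server, each flow pinned to one link, and
    ideal distribution across links. Real numbers will be lower.
    """
    busy_links = min(ports, media_servers)
    return busy_links * GIGE_MB_PER_S


# Four GigE ports on the appliance, varying the number of media servers.
for servers in (1, 2, 4, 6):
    print(f"{servers} media server(s): "
          f"{aggregate_ceiling_mb_s(4, servers):.0f} MB/s ceiling")
```

Under those assumptions, one media server cannot push more than a single link's worth no matter how many ports you plug in, and adding media servers beyond the port count buys nothing either.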


