Backup Per-Job Deduplication Versus…Well…Real Deduplication
Actual Unitrends and REDACTED VMware vSphere 5.5 Storage Deduplication for a Single VM and a Single VMDK
[Preface: Note that although the data is real, the name of our competitor has been obscured, changed, and otherwise redacted to avoid hurting anyone’s feelings. The original post in this series describes this in more detail. The second post in this series describes HOS vs GOS backup and deduplication. I was also asked if this post referred only to VMware – it actually applies to Microsoft Hyper-V and other virtualization platforms as well.]
Before I begin, a quick note about writing this blog post. The toughest part was the title. I found myself using Google to find euphemisms for “calling bullcrap.” In that vein, let’s cut to the chase.
Per-job deduplication is incredibly, remarkably inefficient at deduplication. Don’t take my word for it – look at the data in the chart above. The environment in which this was run should be the poster child for HOS-based (i.e., hypervisor-based) per-job deduplication – it’s the best possible case for demonstrating why per-job deduplication is absolutely great. I am not showing some complicated environment in which “real” deduplication (and by that, I mean either inline or post-processing deduplication that deduplicates across both jobs and time) has an advantage – although that would be fair enough to do. Instead, I’m showing the storage footprint for a single VM, with a 5% change rate, being backed up and replicated by REDACTED and by Unitrends. I’ve also run both at the same block size – 1MB – that is recommended for virtualized backup in all of these products.
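To make the distinction concrete, here is a minimal toy sketch in Python. It is not the test behind the chart and it does not model how REDACTED or Unitrends actually work. It assumes (my assumption, for illustration only) that every job captures a full image of a single VM, that 5% of the blocks change between jobs, and that each block is reduced to a SHA-256 fingerprint. The function names, block counts, and job counts are all hypothetical.

```python
import hashlib
import random


def fingerprint(block: bytes) -> str:
    """Content hash used to decide whether two blocks are duplicates."""
    return hashlib.sha256(block).hexdigest()


def simulate_full_jobs(num_blocks, num_jobs, change_rate, seed=0):
    """Simulate successive full backups of one VM disk.

    Assumption: each block stands in for one 1MB chunk, and between jobs a
    fixed fraction of blocks is rewritten with brand-new content.
    Returns one list of block fingerprints per job.
    """
    rng = random.Random(seed)
    disk = [f"block-{i}-v0".encode() for i in range(num_blocks)]
    jobs = []
    for job in range(num_jobs):
        if job > 0:
            changed = rng.sample(range(num_blocks), round(num_blocks * change_rate))
            for i in changed:
                disk[i] = f"block-{i}-v{job}".encode()
        jobs.append([fingerprint(b) for b in disk])
    return jobs


def per_job_stored(jobs):
    """Blocks kept when duplicates are removed only within each job."""
    return sum(len(set(job)) for job in jobs)


def global_stored(jobs):
    """Blocks kept when duplicates are removed across all jobs and over time."""
    seen = set()
    for job in jobs:
        seen.update(job)
    return len(seen)


jobs = simulate_full_jobs(num_blocks=100, num_jobs=5, change_rate=0.05)
print(per_job_stored(jobs), global_stored(jobs))  # prints: 500 120
```

In this toy setup, per-job deduplication keeps all 500 blocks because nothing repeats inside a single job, while deduplicating across jobs keeps the original 100 blocks plus the 5 changed blocks from each of the 4 later jobs, for 120. Real products add incrementals, compression, and other optimizations, so treat the numbers as an illustration of the mechanism only.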
Read the entire article here: Backup Per-Job Deduplication Versus…Well…Real Deduplication