Veeam Backup with Windows Deduplication Benchmark

Some people wonder why anyone would use Windows Deduplication on a Veeam Backup Repository.   Doesn’t Veeam have its own deduplication and replication?

The people at Veeam actually recommend using Windows Deduplication and have a great writeup about it that you can download here.

Veeam has great deduplication and replication, but it is within a single backup.  Veeam deduplicates the backup of one VM against another VM in the same backup.

Windows Deduplication deduplicates blocks of data across many backups. For example, when I took a second full Veeam backup of multiple VMs and deduplicated it, the 800 GB backup added only about 10 GB to the volume. That means that using Windows Deduplication on the Veeam repository reduces my full replication over the Internet from 800 GB to 10 GB on the second and following full backups.

Of course most people using Veeam are going to do full backups periodically, but incremental backups one or more times each day.

My first incremental backup with Veeam takes about 20 GB of repository storage. Deduplicating that with Windows Deduplication takes it down to about 2 GB of disk usage. This means I can protect my 800 GB of VMs using 2 GB of actual disk storage and replicate it in just a few minutes.

My Veeam repository testbed is actually on a Dell business desktop for price and performance reasons I explain elsewhere on this blog.

Dell 7020, i7-4790 3.4 GHz, 16 GB DDR3-1600, Small Form Factor (SFF) (price about $850, 12/2013)

Probox 4x USB 3.0 4 drive enclosure ($99)

with 4 WD RED NAS 4TB drives (about $600 3/2015)

Windows Server 2012 R2 (your price will vary up to $800)

The Veeam backup time, including sending it across a 1 gigabit network, was about an hour and a half for the first full backup.

Windows deduplication of 836,809,182,740 bytes
Elapsed time is 8963 seconds
93,362,622 bytes per second
336,105,439,904 bytes per hour
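The per-second and per-hour figures above are simply the bytes processed divided by the elapsed time. A minimal sketch of that arithmetic follows; the function name `throughput` is only illustrative.

```python
def throughput(total_bytes: int, elapsed_seconds: int) -> tuple[float, float]:
    """Return (bytes per second, bytes per hour) for one deduplication run."""
    per_second = total_bytes / elapsed_seconds
    return per_second, per_second * 3600


# Figures from the first full-backup deduplication run.
per_second, per_hour = throughput(836_809_182_740, 8963)
print(f"{int(per_second):,} bytes per second")
print(f"{int(per_hour):,} bytes per hour")
```

The same function can be reused for the later runs by substituting their byte counts and elapsed times.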

The first time you run Windows deduplication on a backup file, much of the time is spent compressing the chunks. The first deduplication run is therefore likely to take longer than your daily deduplication of the second and following backups.

Windows reports the dedupe status on the volume containing the Veeam Backup Repository.

Volume : D:
Capacity : 7.27 TB
FreeSpace : 6.98 TB
UsedSpace : 304.06 GB
UnoptimizedSize : 785.56 GB
SavedSpace : 481.5 GB
SavingsRate : 61 %
OptimizedFilesCount : 2
OptimizedFilesSize : 779.34 GB
OptimizedFilesSavingsRate : 61 %
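The fields in this status output are related to each other. Going by the figures shown (an inference from the numbers, not a statement of how Windows computes them internally), SavingsRate looks like SavedSpace divided by UnoptimizedSize, and UsedSpace looks like UnoptimizedSize minus SavedSpace. A small sketch to check that:

```python
def savings_rate(saved_gb: float, unoptimized_gb: float) -> float:
    """Percentage of the unoptimized size that deduplication saved."""
    return 100 * saved_gb / unoptimized_gb


def used_space(unoptimized_gb: float, saved_gb: float) -> float:
    """Space still occupied on disk after deduplication, in GB."""
    return unoptimized_gb - saved_gb


# Figures from the status output after the first full backup.
rate = savings_rate(481.5, 785.56)
used = used_space(785.56, 481.5)
print(f"Savings rate: {rate:.1f} %")
print(f"Used space:   {used:.2f} GB")
```

Both results line up with the reported 61 % and 304.06 GB.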

My second Veeam backup is a forward incremental backup. This is what Veeam suggests you use when storing backups on a Windows deduplicated volume.

The entire backup ran in 6 minutes and 47 seconds.

26,859,267,777 bytes
261 seconds
102,909,071 bytes per second

Windows deduplication is ingesting at 370,472,655,600 bytes per hour. Microsoft says it will only go 100 GB an hour. I did juice it up a bit by running the process in high priority.

But wait, there’s more! (As they say on TV).  The next full backup will be faster.

Meanwhile, let’s look at the volume usage.

Volume : D:
Capacity : 7.27 TB
FreeSpace : 6.97 TB
UsedSpace : 305.54 GB
UnoptimizedSize : 810.63 GB
SavedSpace : 505.09 GB
SavingsRate : 62 %
OptimizedFilesCount : 3
OptimizedFilesSize : 804.35 GB

That’s nice – 26 billion bytes of backup in less than 2 GB of disk space.
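The disk cost of the incremental can be read off as the change in UsedSpace between the two status reports. The sketch below does that subtraction and compares it with the size of the backup. It assumes the "GB" in the status output are binary gigabytes (1024^3 bytes), which fits the growth in OptimizedFilesSize from 779.34 GB to 804.35 GB.

```python
GIB = 1024 ** 3  # assumed: the status output's "GB" are binary gigabytes

# UsedSpace before and after the incremental was deduplicated.
used_before_gb = 304.06
used_after_gb = 305.54

backup_gb = 26_859_267_777 / GIB          # size of the incremental backup
added_gb = used_after_gb - used_before_gb  # new disk space actually consumed
ratio = backup_gb / added_gb

print(f"Backup size:   {backup_gb:.2f} GB")
print(f"New disk used: {added_gb:.2f} GB")
print(f"Reduction:     about {ratio:.0f} to 1")
```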

I ran the incremental again and the size transferred was a lot smaller.

Volume: D:

Job processed space (bytes): 3,837,792,131
Job elapsed time (seconds): 81
Job throughput (MB/second): 45.18

The throughput doesn’t look as good, but it takes time to just start and stop the program, and the whole run was less than a minute and a half.
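To compare this run with the earlier bytes-per-second figures, the job's MB/second has to be put on the same footing. Judging by the numbers, the job output appears to count a megabyte as 1,048,576 bytes; that is an inference from the figures above rather than something the output states. A short conversion sketch:

```python
MIB = 1024 ** 2  # assumed: the job output's "MB" is 1,048,576 bytes


def mib_per_second(total_bytes: int, elapsed_seconds: int) -> float:
    """Throughput in binary megabytes per second."""
    return total_bytes / elapsed_seconds / MIB


small_run = mib_per_second(3_837_792_131, 81)    # second incremental
first_run = mib_per_second(26_859_267_777, 261)  # first incremental

print(f"Second incremental: {small_run:.1f} MB/s")
print(f"First incremental:  {first_run:.1f} MB/s")
```

The second incremental comes out at roughly 45 MB/s, in line with the reported 45.18, against roughly 98 MB/s for the first.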

Volume : D:

Capacity : 7.27 TB
FreeSpace : 6.97 TB
UsedSpace : 306.17 GB
UnoptimizedSize : 814.22 GB
SavedSpace : 508.06 GB
SavingsRate : 62 %

My real concern in doing all this is how long it will take to replicate Veeam backups over the Internet.  I don’t want to be schlepping tapes all over the place.

I have the impression that a lot of Veeam users are only doing full backups once a month or even less.  But I am an old school kind of guy and I really want to be able to do a full backup once a week.  With Veeam deduplication that would still be a lot of data, I think.  But what about with Windows deduplication?

This time I chose ‘Active Full’ backup on Veeam.

Once again the backup to the repository took about an hour and a half.

But look at the Windows deduplication processing:

837,381,046,512 bytes processed

5169 seconds
162,000,589 bytes per second
583,202,121,773 bytes per hour
837,381,046,512 bytes of backup used 10 GB of new disk space

Volume : D:

Capacity : 7.27 TB
FreeSpace : 6.96 TB
UsedSpace : 316.07 GB
UnoptimizedSize : 1.56 TB
SavedSpace : 1.25 TB

583 billion bytes an hour! 

While the overall deduplication rate of the volume is not that high yet, the new full backup used up 10 GB for almost 800 GB of new data.  If this rate holds, I should be able to store about 500 full Veeam backups on this volume.
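A quick sanity check on that estimate, assuming each further full backup keeps costing about 10 GB and nothing else is stored on the volume. Dividing the remaining free space by that figure gives a ceiling of roughly 700 backups, so 500 leaves a margin for incrementals and for weeks when more data changes. The function name is only illustrative.

```python
def max_full_backups(free_tb: float, new_gb_per_full: float) -> int:
    """Upper bound on full backups that fit in the remaining free space."""
    free_gb = free_tb * 1024
    return int(free_gb / new_gb_per_full)


# 6.96 TB free after the second full backup, about 10 GB consumed per full.
ceiling = max_full_backups(6.96, 10.0)
print(f"Upper bound: about {ceiling} full backups")
```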

Of course, my original plan was to store forward incrementals with a full backup once a week. I will probably still do that, but I could also just do full backups for a long, long time.

(Shameless product plug) What is going to make this nice for me is using Replacador to replicate the Windows deduplication volume offsite and to the cloud. Windows deduplication cuts the backup size down so much that I can replicate a day’s backups in under ten minutes and a full backup in an hour or so.
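Replication time depends mostly on how much deduplicated data changed and on the uplink speed. The sketch below estimates transfer time for the roughly 2 GB of a day's incremental and the roughly 10 GB of a full backup. The 30 Mbit/s uplink is purely a placeholder for illustration; substitute your own line speed. Protocol overhead and any compression done by the replication tool are ignored.

```python
def transfer_minutes(size_gb: float, uplink_mbit_per_s: float) -> float:
    """Minutes needed to send size_gb (binary GB) over the given uplink."""
    size_bits = size_gb * 1024 ** 3 * 8
    seconds = size_bits / (uplink_mbit_per_s * 1_000_000)
    return seconds / 60


UPLINK_MBIT = 30.0  # hypothetical uplink speed, not a measured value

daily = transfer_minutes(2, UPLINK_MBIT)   # about 2 GB of new data per day
full = transfer_minutes(10, UPLINK_MBIT)   # about 10 GB for a new full backup

print(f"Daily incremental: {daily:.1f} minutes")
print(f"Weekly full:       {full:.1f} minutes")
```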

How does this scale? If you are doing anything up to about 10 TB of Veeam full backup once a week, with incrementals the rest of the time, you could process it with this system. You would want to put in 8 TB hard drives, I expect. I’m running these drives as RAID 10 through Windows file services on USB 3.0.

Of course a “real” server would have faster hard drives, and if you use faster memory and a sufficiently fast processor you might go even quicker than this. We will be testing Windows deduplication on a new Dell 530 with 2133 memory soon and hope to bring you even better numbers.

According to our testing, single-core speed and memory speed are the most important factors in Windows deduplication ingestion speed.

Windows deduplication can add a lot of value to the Veeam backup process. It allows you to store more backups in less space, and replicate them in far less time, than with Veeam alone.

 
