Replicating a Veeam Backup Repository with Windows Deduplication

I have already discussed the advantages of using Windows Deduplication for Veeam Backup Repositories, and introduced the idea of using Replacador to replicate a Windows Deduplication volume.

Now we will set up Replacador on a Windows Deduplication volume that holds a Veeam Backup Repository and then run the replication process.

First browse to C:\LaserVault\Replacador and execute the ReplacadorConfig application.

We will use this to define a replication task for our source and destination volumes.

In this first example, we will replicate the volume to another volume on an external drive on the same server.  The same process also runs across a network or the Internet.  We will give an example of that later.

In the Replication Configuration Screen, press the Add Definition button and define a replication task.

In this case, we make the task name vmbackup, and the machine name is ‘.’ (a period means the current machine).

The volume Path is D:\  Normally a deduplication volume will be a drive letter on the current machine.


Don’t click OK yet.

Each replication needs at least one Destination. You can actually replicate to multiple destinations at once.

Click the Add Destination button and a new form opens to define the destination.

In this case we are replicating to another volume on the same server, so the Machine Name is ‘.’ for the local server again.

In the case of a different network location, this could be the network name of the target machine, or its IP address.

The username is the local username on the target machine plus a password.  In this case we are using administrator, but you could use the system account or whatever is appropriate.  The user needs to have sufficient authority to run Windows Deduplication garbage collection on the target machine.

The volume path is the local pathname on the target machine to the deduplication volume that will be a clone of the source deduplication volume.

The UNC path is the UNC version of this target deduplication volume, consisting of \\machinename\volume

In the case where the target is on this local machine, just use the volume path again.


Now click OK, then OK, then OK, and your replication is configured.
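To recap what the two forms capture, here is a small Python sketch of the fields as a data structure. It is only an illustration: the class names, field names, the destination drive letter E:\, the machine name drserver, and the share name dedup are all hypothetical, and this is not how Replacador itself stores its configuration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Destination:
    machine_name: str      # "." means the local machine
    username: str          # account on the target machine
    volume_path: str       # local path on the target, e.g. "E:\\" (assumed)
    share_name: str = ""   # hypothetical share that exposes the volume

    def unc_path(self) -> str:
        # For a target on the local machine, reuse the volume path.
        if self.machine_name == ".":
            return self.volume_path
        # Otherwise build \\machinename\share
        return "\\\\" + self.machine_name + "\\" + self.share_name


@dataclass
class ReplicationTask:
    task_name: str
    machine_name: str
    volume_path: str
    destinations: List[Destination] = field(default_factory=list)


# One source volume, replicated to a local drive and to a remote server.
task = ReplicationTask("vmbackup", ".", "D:\\")
task.destinations.append(Destination(".", "administrator", "E:\\"))
task.destinations.append(
    Destination("drserver", "administrator", "E:\\", "dedup")
)

for dest in task.destinations:
    print(dest.unc_path())
```

The point of the sketch is simply that one task has one source volume and one or more destinations, and that the UNC path only differs from the volume path when the target is another machine.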

The Replacador Manual explains how to run the Replacador Transfer program from the command line or through the task scheduler, but since we are only testing, we will click to execute the ReplacadorTransfer application.  Since there is only one replication task defined and the default action of the application is to replicate, it will do exactly what we want.

When you first start Replacador Transfer, it looks like nothing is happening.  Actually, Replacador Transfer starts a deduplication job on the source volume to make sure that everything is ready for replication.  If you have already deduplicated the volume, this part of the task will just take a minute or two.

Once the source volume is deduplicated, a command window will open and display the replication progress.


You can get a better idea of what is going on by looking at the Task Manager performance screen.


Replication is really just a specialized copy task that takes very few CPU or memory resources.  The limiting factor is the speed of reading, transmitting, and writing the data on the target volume.

The whole point to replication is to reduce the data traffic to the minimum needed to move the changes from the source volume to the destination volume.  The first replication will be a large one, which is why it is sometimes a good idea to replicate to an external drive to seed the actual target server volume.  After that, the replication process should be a small fraction of the original deduplication volume content, even for a new full backup.

When Replacador is done, you will have an exact copy of the deduplicated volume on the target server.  Each time you replicate in the future, only the changed chunks and reparse points on the deduplicated volume will be sent to the target volume.  The original files will not be reflated at any time in the process.


Replication for Windows 2012 R2 Deduplication with Replacador

Windows Deduplication is a free feature in Windows Server 2012 and Server 2012 R2.  It works great and I recommend it.

I’ve also worked with other deduplication systems, including Data Domain, Avamar, ExaGrid, GreenBytes, OpenSolaris and Nexenta.

Deduplication works especially well for backup files. With a deduplication system, you can store many backups on one deduplication appliance, because deduplication only stores each unique chunk of data once.

Besides the obvious advantage of taking up much less disk space for each new backup, deduplication reduces the amount of data transmission you need in order to replicate the deduplicated backup across the network or Internet to an offsite location.

The reduction in data seems magical when you first encounter it.  It actually makes it possible to replicate a backup in a reasonable amount of time with the Internet connection you already have, for most people.

The one problem with using Windows Deduplication instead of another backup appliance is that it does not have replication built into it.

We decided to do something about this, so we wrote Replacador as a replication system for Windows Deduplication.

Replacador looks at two Windows Deduplication volumes and keeps them synchronized.  First, the source volume is optimized to turn all in-policy files into deduplication chunks and reparse points. Then all the new or changed chunks on the source volume are copied to the destination volume, along with the reparse points. Anything that has been deleted from the source volume is deleted from the destination volume.

Finally, garbage collection is run on both source and destination volumes to keep them synchronized.
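The copy-and-delete step described above can be sketched in a few lines of Python. This is a toy model, not Replacador's implementation: it treats each volume as a dictionary mapping a chunk identifier to its bytes, which is an assumption made purely for illustration. The real Windows chunk store is more involved.

```python
from typing import Dict, Set, Tuple


def plan_sync(source: Dict[str, bytes],
              dest: Dict[str, bytes]) -> Tuple[Set[str], Set[str]]:
    """Return (chunk ids to copy, chunk ids to delete)."""
    # New or changed chunks: missing on the destination, or different there.
    to_copy = {cid for cid, data in source.items() if dest.get(cid) != data}
    # Chunks that no longer exist on the source.
    to_delete = set(dest) - set(source)
    return to_copy, to_delete


def apply_sync(source: Dict[str, bytes], dest: Dict[str, bytes]) -> None:
    """Bring dest in line with source, touching only what differs."""
    to_copy, to_delete = plan_sync(source, dest)
    for cid in to_copy:
        dest[cid] = source[cid]
    for cid in to_delete:
        del dest[cid]


source = {"a": b"1", "b": b"2", "c": b"3"}
dest = {"a": b"1", "b": b"old", "d": b"4"}

to_copy, to_delete = plan_sync(source, dest)
print(sorted(to_copy), sorted(to_delete))

apply_sync(source, dest)
print(dest == source)
```

Chunk "a" is never transmitted because it already matches, which is the whole reason replication traffic stays small after the first run.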

The source volume has to be a locally attached volume on the source system.  The destination volume can be a second volume on the source system, such as an external drive.  Usually it is a volume on another server that is on the network or the internet.  Both servers must be running the same version of Windows 2012.  We suggest that they should be running release R2 because the deduplication process is much faster on R2.

The volumes do not have to be the same size, but you will probably want them to be.  The second volume has to have enough room for all the chunks, the reparse points, and some extra room for garbage collection to run.

The reason we support replicating to an external drive is to make it easy to ‘seed’ a new remote deduplication volume when you first start deduplicating.  You can replicate to an external drive on the source system, carry it to the remote system, and replicate from the external drive to the destination drive one time.  This can save days or even weeks of data transmission, in some cases.

This also makes it possible to replicate back from the destination volume to an external drive to quickly restore to a source system in case of catastrophic loss of the first volume, due to tornado, fire, flood, etc.

An external disk drive can also be used as a source volume for deduplication.  Some external drives support RAID data protection.  A good example of this is the Western Digital Duo series.  The Duo 8TB costs less than $350 and provides 4 TB of protected storage.  There is also a 12 TB Duo with 6 TB of available storage.

There is a beta test version of Replacador available.  You will also need an authorization code to use Replacador.

Here is the PDF of the documentation.

Replacador Configuration and Use

To get an activation code, browse to c:\LaserVault\Replacador and click the Replacador Configuration application.

Click the Authorization button on the lower left.


Replacador generates a unique serial number for your server.  Copy the contents of the Serial Number and paste it into an email and send it to

We will send you a code good for 30 days.



When you receive the code, paste it into the Authorization Key and click OK.

Next, set up and run Replacador.

Veeam Backup Repository Settings for Windows Deduplication 2012 R2

To use Windows Deduplication with Veeam Enterprise Plus, you will most likely want to use a real Windows Server and create a deduplication-enabled volume for your Veeam backups.  Your Veeam backups will be stored in a Veeam Backup Repository, which is a folder holding all the files.

Windows Deduplication ingestion is a CPU- and memory-intensive procedure, and it is probably best not to run it in a VM.

For the same reason, it is best to run Windows Deduplication on a server that is not being used in production.

On another blog post, we show you how to roll your own deduplication appliance.

You can have multiple Veeam backup repositories on the same deduplicated volume.  For example, if you have three different Hyper-V servers, each with its own collection of VMs, you could have three repositories on one deduplicated volume.

You can also have multiple volumes with deduplication enabled on the same Windows server.  You might want to do this because the different Hyper-V or VMware hosts have too many VMs for the volume you are using, or because the VMs are very different from each other and will deduplicate better if you organize them on separate volumes, with all the Linux MySQL VMs on one volume, for example.

In this example, we are working with two different servers.  Server H is the host, which is running Windows 2012 R2 with Hyper-V and is hosting multiple VMs.  It has Veeam Enterprise Plus installed on it for backup.

Server R is the Repository server.  This is where Veeam will remotely install the Veeam Backup Repository agent and NFS.

You will be doing all your typing and viewing on Server H, while Veeam will install its software across the network on Server R.

Within Veeam Enterprise Plus, click on Backup Repositories, then right click and add new repository.

Click on Microsoft Windows Server, then Next.

(Screenshot: Edit Backup Repository)

Put in the IP address or network name of the server.

At this point Veeam will ask you for the Username and Password to use on the repository server.


Browse to or create the folder name for the backup repository.


Set the Storage Compatibility Settings for deduplication.

(Screenshot: Storage Compatibility Settings)

These are the best settings for a backup repository that will be on a volume with Windows Deduplication enabled.

The benchmark shown on this blog was run with these settings.

Veeam will ask to install its own NFS. OK this with these settings.


Veeam will then install the repository software and NFS on the repository server R.


Now go back to the backup job you have set up, or create a new job, and point it to your new repository.

Run the backup job. When it is complete, go to the repository server and run Windows deduplication, or use Replacador to do this.
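If you would rather script the deduplication step than start it by hand, one option is to call the Start-DedupJob PowerShell cmdlet that ships with the Data Deduplication feature. The Python sketch below only builds and runs that command line. The function names are my own, the D: volume is taken from this example, and it assumes you run it from an elevated session on the repository server.

```python
import subprocess
from typing import List


def dedup_command(volume: str, priority: str = "Normal") -> List[str]:
    """Build the PowerShell command line for an optimization job."""
    if priority not in ("Low", "Normal", "High"):
        raise ValueError("priority must be Low, Normal, or High")
    ps = (
        f"Start-DedupJob -Volume {volume} -Type Optimization "
        f"-Priority {priority} -Wait"
    )
    return ["powershell.exe", "-NoProfile", "-Command", ps]


def run_dedup(volume: str, priority: str = "Normal") -> int:
    """Run the job and return the PowerShell exit code (Windows only)."""
    return subprocess.run(dedup_command(volume, priority)).returncode


# Show the command without executing it.
print(" ".join(dedup_command("D:")))
```

Calling run_dedup("D:") after the Veeam job finishes would block until the optimization job completes, because of the -Wait switch.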

The first deduplication will not run as fast as your second and following deduplications.

Veeam Backup with Windows Deduplication Benchmark

Some people wonder why anyone would use Windows Deduplication on a Veeam Backup Repository.   Doesn’t Veeam have its own deduplication and replication?

The people at Veeam actually recommend using Windows Deduplication and have a great writeup about it that you can download here.

Veeam has great deduplication and replication, but it is within a single backup.  Veeam deduplicates the backup of one VM against another VM in the same backup.

Windows Deduplication deduplicates blocks of data across many backups.  For example, when I deduplicated a full Veeam backup of multiple VMs a second time, 800 GB of backup deduplicated down to 10 GB.  That means that using Windows Deduplication on the Veeam Repository reduces my full replication over the Internet from 800 GB to 10 GB, on the second and following full backups.

Of course most people using Veeam are going to do full backups periodically, but incremental backups one or more times each day.

My first incremental backup with Veeam takes about 20 GB of repository storage.  Deduplicating that with Windows Deduplication takes it down to about 2 GB of disk usage.  This means I can protect my 800 GB of VMs using 2 GB of actual disk storage and replicate it in just a few minutes.

My Veeam repository testbed is actually on a Dell business desktop for price and performance reasons I explain elsewhere on this blog.

Dell 7020 i7 4790 3.4 GHz, 16GB 1600 speed DDR3, Small Form Factor SFF (price about $850 12/2013)

Probox 4x USB 3.0 4 drive enclosure ($99)

with 4 WD RED NAS 4TB drives (about $600 3/2015)

Windows Server 2012 R2 (your price will vary up to $800)

The Veeam backup time, including sending it across a 1 gigabit network, was about an hour and a half for the first full backup.

Windows deduplication  of 836,809,182,740 bytes
Elapsed time is 8963 seconds
93,362,622 bytes per second
336,105,439,904 bytes per hour
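For anyone who wants to check the arithmetic, the rates above follow directly from the byte count and the elapsed time. A minimal Python sketch (the function name is mine, and it rounds down to whole bytes):

```python
from typing import Tuple


def throughput(total_bytes: int, seconds: int) -> Tuple[int, int]:
    """Return (bytes per second, bytes per hour), rounded down."""
    per_second = total_bytes // seconds
    per_hour = total_bytes * 3600 // seconds
    return per_second, per_hour


per_second, per_hour = throughput(836_809_182_740, 8963)
print(f"{per_second:,} bytes per second")  # 93,362,622 bytes per second
print(f"{per_hour:,} bytes per hour")      # 336,105,439,904 bytes per hour
```

The same function can be reused for the later runs in this post by plugging in their byte counts and elapsed seconds.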

The first time you run windows deduplication on a backup file, much of the time is used in the compression of the chunks.  Therefore the deduplication time is likely to be longer than your daily deduplication of second and following backups.

Windows reports the dedupe status on the volume containing the Veeam Backup Repository.

Volume : D:
Capacity : 7.27 TB
FreeSpace : 6.98 TB
UsedSpace : 304.06 GB
UnoptimizedSize : 785.56 GB
SavedSpace : 481.5 GB
SavingsRate : 61 %
OptimizedFilesCount : 2
OptimizedFilesSize : 779.34 GB
OptimizedFilesSavingsRate : 61 %
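The fields in this report appear to be related in a simple way: used space is the unoptimized size minus the saved space, and the savings rate is saved space as a percentage of the unoptimized size. That relationship is my reading of the numbers, not something taken from Microsoft's documentation, but a quick Python check agrees with the output above.

```python
def used_space_gb(unoptimized_gb: float, saved_gb: float) -> float:
    """Disk space actually consumed after optimization."""
    return round(unoptimized_gb - saved_gb, 2)


def savings_rate_pct(unoptimized_gb: float, saved_gb: float) -> int:
    """Saved space as a whole-number percentage of the unoptimized size."""
    return round(100 * saved_gb / unoptimized_gb)


print(used_space_gb(785.56, 481.5))     # 304.06
print(savings_rate_pct(785.56, 481.5))  # 61
```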

My second Veeam Backup is a forward incremental backup.  This is what Veeam suggests you use when storing backups on a windows deduplicated volume.

The entire backup ran in 6 minutes and 47 seconds.

26,859,267,777 bytes
261 seconds
102,909,071 bytes per second

Windows Deduplication is ingesting at 370,472,655,600 bytes per hour.  Microsoft says it will only go 100 GB an hour.  I did juice it up a bit by running the process in high priority.

But wait, there’s more! (As they say on TV).  The next full backup will be faster.

Meanwhile, let’s look at the volume usage.

Volume : D:
Capacity : 7.27 TB
FreeSpace : 6.97 TB
UsedSpace : 305.54 GB
UnoptimizedSize : 810.63 GB
SavedSpace : 505.09 GB
SavingsRate : 62 %
OptimizedFilesCount : 3
OptimizedFilesSize : 804.35 GB

That’s nice – 26 billion bytes of backup in less than 2 GB of disk space.

I ran the incremental again and the size transferred was a lot smaller.

Volume: D:

Job processed space (bytes): 3,837,792,131
Job elapsed time (seconds): 81
Job throughput (MB/second): 45.18

The throughput doesn’t look as good, but it takes time to just start and stop the program, and the whole run was less than a minute and a half.

Volume : D:

Capacity : 7.27 TB
FreeSpace : 6.97 TB
UsedSpace : 306.17 GB
UnoptimizedSize : 814.22 GB
SavedSpace : 508.06 GB
SavingsRate : 62 %

My real concern in doing all this is how long it will take to replicate Veeam backups over the Internet.  I don’t want to be schlepping tapes all over the place.

I have the impression that a lot of Veeam users are only doing full backups once a month or even less.  But I am an old school kind of guy and I really want to be able to do a full backup once a week.  With Veeam deduplication that would still be a lot of data, I think.  But what about with Windows deduplication?

This time I chose ‘Active Full’ backup on Veeam.

Once again the backup to the repository took about an hour and a half.

But look at the windows deduplication processing:

837,381,046,512 bytes processed

5169 seconds
162,000,589 bytes per second
583,202,121,773 bytes per hour
837,381,046,512 bytes of backup used 10 GB of new disk space

Volume : D:

Capacity : 7.27 TB
FreeSpace : 6.96 TB
UsedSpace : 316.07 GB
UnoptimizedSize : 1.56 TB
SavedSpace : 1.25 TB

583 billion bytes an hour! 

While the overall deduplication rate of the volume is not that high yet, the new full backup used up 10 GB for almost 800 GB of new data.  If this rate holds, I should be able to store about 500 full Veeam backups on this volume.

Of course, my original plan was to store incremental forwards with a full backup once a week.  I will probably still do that, but I could also just do full backups for a long long time.

(Shameless product plug) What is going to make this nice for me is using Replacador to replicate the Windows Deduplication volume offsite and to the cloud.  Windows Deduplication cuts the backup size down so much that I can replicate a day’s backups in under ten minutes and a full backup in an hour or so.

How does this scale?  If you are doing anything up to about 10 TB of Veeam full backup once a week, with incrementals the rest of the time, you could process it with this system.  You would want to put in 8 TB hard drives, I expect.  I’m running these RAID 10 through Windows file services on USB 3.0.

Of course a “real” server would have faster hard drives, and if you use faster memory and a sufficiently fast processor you might go even quicker than this.   We will be testing windows deduplication on a new Dell 530 with 2133 memory soon and hope to bring you even better numbers.

According to our testing, the single core speed and the memory speed are the most important factors in the windows deduplication ingestion.

Windows deduplication can add a lot of value to the Veeam backup process.  It allows you to store more backups in less space, and replicate them in far less time, than with Veeam alone.


Backing up with Veeam to your Windows 2012 R2 Deduplication Appliance

If you are following along with our idea of making your own deduplication appliance, you might be interested in using it as one or more Veeam Backup Repositories.

Veeam has some postings about using Windows Deduplication with Veeam backups, and they seem to think it is a great idea.

I do too, and I think our Replacador replication for Windows Deduplication makes things even better.

We have been backing up our own Windows discrete servers for years with plain old Windows Backup.  As we gradually migrated our servers to our first big Hyper-V server (called Borg1 for some reason) we kept using Windows Backup doing full backups every night, deduplicating, and replicating.

When I first heard of Veeam it was in reference to deduplication.  When I read about Veeam’s positive attitude towards Windows Deduplication, I became even more interested.  So we decided to install the Veeam Enterprise Plus trial edition.

I set it up on our Borg1 server and defined the backup job for most of our VMs.  I skipped our document management VM for now because I’m impatient to run the tests faster and that data doesn’t change much.  I will add that in later for production.

I set up a Veeam Backup Repository on one of our UBD servers running Windows 2012 R2 with a deduplication volume. Actually I just twiddled my thumbs while Veeam did all the work.  I did get to make some important decisions about the settings for the Backup Repository, which I will share with you in a future post.

I set the system up for forward incremental backups with a full backup once a week.

The first Veeam backup took about an hour and a half and moved 800 GB of data across the network.  I ran Windows Deduplication on the volume and it compressed and deduped about 60%.  The deduplication job ran in a couple of hours, and of course was mostly compression for the first day.

I couldn’t wait a whole day to do another Veeam backup, so I did the same backup again after a couple of hours.  This was over the same 800 GB set of VMs.

The Veeam backup ran in 6 minutes and 47 seconds. It sent about 26 billion bytes of data to the Backup Repository.

Windows Deduplication ran in 15 minutes or so. The 26 GB of data on disk became 1 GB of deduplicated data. Over our 30 Mbit per second Internet upload we will be replicating in about six minutes.
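As a rough sanity check on that replication estimate, the best-case transfer time is just the size in bits divided by the link speed. The Python sketch below assumes 1 GB means 10^9 bytes and 1 Mbit means 10^6 bits, and it ignores protocol overhead and anything else sharing the line, so it gives a lower bound rather than a prediction.

```python
def transfer_seconds(size_bytes: int, link_mbit_per_s: float) -> float:
    """Ideal transfer time at full line rate, with no overhead."""
    return size_bytes * 8 / (link_mbit_per_s * 1_000_000)


one_gb = 1_000_000_000
print(f"{transfer_seconds(one_gb, 30) / 60:.1f} minutes")
print(f"{transfer_seconds(800 * one_gb, 30) / 3600:.0f} hours")
```

For 1 GB on a 30 Mbit link this comes out a little under four and a half minutes at line rate, so about six minutes in practice is plausible. The same calculation for the raw 800 GB gives about 59 hours, which is why deduplicating before replicating matters so much.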

According to what I have read about Veeam, it is deduplicating within a single backup, across VMs. What Windows Deduplication is adding is the deduplication across multiple backups. This means even a periodic full backup will take up very little space on the deduplicated volume and very little replication bandwidth.

The only replication job that should be somewhat large is the very first one, and our Replacador software supports replication to an external drive, which means you can seed the replication to a drive then send it to your DR site for immediate protection.

Every time I think of Veeam, I am saying WOW.  What an incredible product. If you haven’t tried it yet, spend an hour or two and set it up. And smile.

I am going to publish the settings and statistics for my first Veeam backup jobs in another post.



Roll Your Own Deduplication Appliance with Windows Server 2012 R2

We have been doing a lot of testing and implementation of Windows Deduplication and in the process we have come up with a basic roll-your-own dedupe appliance using Windows Server 2012 R2.

After testing Windows Deduplication on various hardware, we have come to use a simple business-class desktop with an external RAID array as our basic deduplication workhorse. Deduplication wants a fast single-core speed and fast memory, and the cheapest way, by far, is to fulfill these requirements with a desktop system.

A good place to begin is with a Dell 7020 or 9020 Small Form Factor computer with an Intel i7 processor. As of March 2015, a system with 16GB of 1600 memory is about $800. For storage, a USB 3.0 Western Digital Duo drive in either 8TB ($340) or 12TB ($650) size can be a good choice. The Duo drive is actually 2 drives in an external enclosure. You can set the drives up with internal RAID 1 mirroring, so you get 4TB or 6TB of usable space.

Plug these together and install Windows Server 2012 R2 on the Dell and you have a deduplication system. Some people like HP instead of Dell. Some brave hearts swear by SuperMicro.

If you need more storage or you want to put your deduplication appliance in a rack, you can use a rackmount RAID enclosure like the USB 3.0 Akitio MD4 U3B ($350) with 4 3.5 inch drives. Set it up as RAID 10 for both speed and protection, using 4TB, 6TB, or 8TB drives. Put it in the rack, and flip the 7020 sideways and put it on top.

At this time, 4 WD RED NAS drives are about $600. So for $1800 and the price of Windows Server 2012 R2 (which can be anywhere up to about $700 retail) you have a deduplication appliance with 8TB available.

Since Windows Deduplication is post process, you will need a certain amount of that storage for the raw files before you deduplicate them. Since we are using these systems for deduplicating and replicating backups to our DR site, we need room for at least one day’s full backup plus 50% ‘fudge factor’ (this is a professional term of art from the 1960s, they may call this something else now.) Our daily full backup is a bit over 1 TB, so 8 TB – 1.5 TB is 6.5 TB of deduplication space. At 25 to 1, that is over 150 full backups.
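The sizing arithmetic in that paragraph is easy to rerun for your own numbers. This Python sketch uses the figures from the text as defaults (50% fudge factor, 25 to 1 deduplication ratio); the function name is mine, and the ratio is whatever you actually observe on your own backups.

```python
def full_backups_that_fit(volume_tb: float,
                          daily_full_tb: float,
                          fudge: float = 0.5,
                          dedup_ratio: float = 25.0) -> int:
    """Estimate how many full backups fit on a post-process dedup volume."""
    # Room that must stay free for the raw backup before it is deduplicated.
    landing_tb = daily_full_tb * (1 + fudge)
    # What is left over holds deduplicated data.
    dedup_space_tb = volume_tb - landing_tb
    return int(dedup_space_tb * dedup_ratio / daily_full_tb)


print(full_backups_that_fit(8, 1))  # 162
```

With an 8 TB volume and a 1 TB daily full, 1.5 TB is reserved for landing space and the remaining 6.5 TB holds about 162 fulls at 25 to 1, which matches the "over 150" figure.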

This can be a useful low end SMB system, a Proof of Concept (POC) system, or a departmental system for an enterprise.

Many of our customers want ‘real’ servers and will install hardware that costs 2 to 4 times this much. That is okay too.

The system with 4TB available costs $350 for the 8TB WD Duo, $800 for a Dell 7020, plus the cost of Windows Server 2012 R2.

Real deduplication for $1150 plus Windows.

By the way, I’ve clocked my 4TB (8TB) Duo system deduplication at over 400 billion bytes an hour on full backups after the first day (because the first day is mostly compression).  The Akitio based system is a little bit slower, but still respectable speed for what we are doing.

This is a roll your own price.  When I sell similar systems they cost a lot more, because all my hippies quit and now I have to pay my employees.

You can’t do everything with these that you can with the big name deduplication appliances.  They won’t scale as high – Windows deduplication doesn’t work on a physical disk volume over 64TB in size, for example.

The big guys are claiming ever more dizzying ingest rates for their systems as well. Microsoft claims their R2 version of deduplication tops out about 40 MB a second, but we generally see speeds two to three times that fast. 350 billion bytes an hour to 450 billion bytes an hour is typical with the systems we test. We expect this to continue to increase as processors and memory get faster.

If your full backups are up to about 3 TB a day, one of these systems may work for you. If you are doing periodic full backups and incremental backups the rest of the time, that number could be higher.

With the low cost of these systems, you can divide up the work and have two, four, or even eight deduplication appliances for different backups.

The other thing Microsoft Deduplication is missing is replication, but we have solved that problem. We will be releasing our own replication system, Replacador, very soon.