Windows Server 2012 deduplication: How and where to tweak

One of the watershed features of Windows Server 2012 is volume deduplication. Rick Vanover shows where and how to tweak deduplication in this post.

When Windows Server 2012 was being previewed, I was surprised that deduplication would be included in the server operating system. At first, I thought it would make sense only for a file server, because it is not meant for structured data types like Exchange, SQL or VMs. Further, it is not permitted on the C:\ drive. After chewing on it a bit more, I’ve found a number of additional use cases and find it quite versatile for any IT environment. Windows deduplication can be used for backup file shares, volumes for application dumps (like SQL Server log backups and log flushes) and general application volumes for systems. It may also be a candidate for volumes that hold the virtual machine templates or libraries used in a build process.

What’s better, it doesn’t require the cost investment that is normally associated with hardware deduplication appliances. And it is Windows. You already know how to support Windows.

With that being said, how do you tweak Windows Server 2012 deduplication? First, you need to know how to install it. I wrote a blog post last year based on one of the early previews, and the process for turning it on has not changed since the beta.
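For readers who prefer the command line, here is a minimal sketch of turning the feature on from PowerShell. It assumes an elevated session on Windows Server 2012 and that the role service is named FS-Data-Deduplication on your build; the first command lets you confirm that before installing anything.

```powershell
# Check whether the Data Deduplication role service is present and installed.
# The feature name is an assumption to verify against this command's output.
Get-WindowsFeature -Name FS-Data-Deduplication

# Install the role service if it is not already installed.
Install-WindowsFeature -Name FS-Data-Deduplication

# Load the deduplication cmdlets into the current session.
Import-Module Deduplication
```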

Once deduplication is enabled for the server, the next step is to designate the volumes that the deduplication engine will process. This is easily done in PowerShell or the Server Manager interface. Issue the PowerShell command Enable-DedupVolume for each volume, as shown in Figure A below:

Figure A
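As a text stand-in for the screenshot, the command looks roughly like the sketch below. The drive letter E: is a hypothetical data volume chosen for illustration, not the one used in the original test.

```powershell
# "E:" is a hypothetical data volume; deduplication is not permitted on C:.
$volume = "E:"

# Mark the volume for processing by the deduplication engine.
Enable-DedupVolume -Volume $volume

# Confirm that the volume now reports deduplication as enabled.
Get-DedupVolume -Volume $volume | Format-List
```

Repeat the Enable-DedupVolume call for each additional volume you want processed.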

A number of factors will determine how effective Windows will be in its processing. The first is the number of scheduled tasks that Windows will execute to find space savings. These tasks are located in the Windows Task Scheduler under Microsoft | Windows | Deduplication, as shown in Figure B below:

Figure B
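The same tasks can be listed without opening Task Scheduler. This is a small sketch, assuming the task folder path matches the Microsoft | Windows | Deduplication location described above:

```powershell
# List the scheduled tasks in the Deduplication folder of Task Scheduler.
Get-ScheduledTask -TaskPath "\Microsoft\Windows\Deduplication\"

# Show the deduplication schedules as the deduplication module reports them.
Get-DedupSchedule
```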

Within Server Manager, the Volumes Overview of each disk will give you a quick visual indication of the deduplication space savings from the Windows Server 2012 deduplication engine. Deduplication with Windows works within a single volume (though multiple volumes can have it enabled) and space savings are delivered when the scheduled tasks run.

I set up a simple example that should deduplicate extremely well: a 250 MB file that existed over 60 times on a file server. Immediately after deduplication was enabled on the volume, the efficiency was listed at 0% and no space savings were reported. In normal use, Windows deduplication takes time to deliver savings, so set your expectations accordingly. While you can run the Background Optimization scheduled task on demand to get deduplication savings, note that this takes CPU and disk cycles to run (as do all deduplication technologies). You can launch an optimization via PowerShell with the Start-DedupJob cmdlet, as shown below in Figure C:

Figure C
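In text form, an on-demand optimization run might look like the following sketch. The volume E: is again a hypothetical placeholder.

```powershell
# "E:" is a hypothetical volume that already has deduplication enabled.
$volume = "E:"

# Queue an optimization job now instead of waiting for the scheduled task.
Start-DedupJob -Volume $volume -Type Optimization

# List running and queued deduplication jobs to follow progress.
Get-DedupJob
```

Because the job consumes CPU and disk cycles, it is worth running this outside of busy hours on a production file server.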

The other point to note is that deduplication benefits may not be shown right away. The reason is that, by default, files must be at least 5 days old before they become candidates for deduplication. This option is in the deduplication settings and can be configured per volume. Additionally, this area of options lets you add exclusions so that certain file types are not processed for deduplication. These options are shown in Figure D below:

Figure D

In the figure above, the value is changed from the default of 5 to 0 so we can see immediate results. It’s important to note that this is the file's age on this volume, not necessarily the timestamp of the file.
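The same per-volume settings can be changed from PowerShell with Set-DedupVolume. The sketch below uses a hypothetical volume E: and hypothetical extensions; the exact format expected for extensions is an assumption, so check Get-Help Set-DedupVolume before relying on it.

```powershell
# "E:" is a hypothetical volume that already has deduplication enabled.
$volume = "E:"

# Make files eligible immediately instead of after the default age in days.
# A value of 0 suits a test like this one, not necessarily production.
Set-DedupVolume -Volume $volume -MinimumFileAgeDays 0

# Exclude some file types from processing. The extensions listed here are
# illustrative only, and writing them without a leading dot is an assumption.
Set-DedupVolume -Volume $volume -ExcludeFileType "iso","tmp"

# Review the resulting settings for the volume.
Get-DedupVolume -Volume $volume | Format-List
```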

Once deduplication is completed, the results are displayed in Server Manager. Using the example above with 60 instances of the same file, and some other candidates, deduplication saved a lot of space, as shown in Figure E below:

Figure E
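The savings shown in Server Manager can also be read from PowerShell. This is a minimal sketch against a hypothetical volume E:; the numbers returned depend entirely on your own data.

```powershell
# "E:" is a hypothetical volume that has completed at least one optimization job.
$volume = "E:"

# Full status report for the volume, including file counts and space saved.
Get-DedupStatus -Volume $volume | Format-List

# Compact view of the savings figures.
Get-DedupVolume -Volume $volume | Select-Object Volume, SavedSpace, SavingsRate
```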

Between the properties of deduplication for the volume set in Server Manager and the wealth of PowerShell cmdlets for Windows, there are plenty of places and ways to tweak this feature of Windows. Additionally, this TechNet page has an extensive list of the PowerShell cmdlets available specifically for deduplication.

Have you been using Windows Server 2012 deduplication? What is your experience so far? Share your experiences below.

About

Rick Vanover is a software strategy specialist for Veeam Software, based in Columbus, Ohio. Rick has years of IT experience and focuses on virtualization, Windows-based server administration, and system hardware.

4 comments
Intunericitu

I have in a test environment, 6 files VHDX "parent disks" hosted on a 60 gb SSD hard drive used for Hyper-V role. This saves about 68% of total storage space. I think it's one of the conditions recommended because files VHDX "parent disks" are read-only and the SSD drive has no other reason to be written by other processes. Thus, the deduplication process is fully optimized.

Gisabun

So exactly what deduplication is? Would of been nice.

pethers

Deduplication saves space on a hard drive by only keeping one copy of any files that may be identical. For example, a large bunch of staff might all receive an email with an attachment and then all save that attachment to a network drive on a server - all to different folder locations like into their own home drive. The Deduplication will recognise files that are the same and only keep one and link all instances to that one file - resulting in saving space.

b4real

Deduplication is the removal of like blocks on disk to save space.