VSAN 6.2 Part 1 – Deduplication and Compression

Cormac

8 years ago

Now that VSAN 6.2 is officially launched, it is time to start discussing some of the new features that we have introduced into our latest version of Virtual SAN. Possibly one of the most eagerly anticipated features is deduplication and compression, two space efficiency techniques that reduce the overall storage consumption of the applications running in virtual machines on Virtual SAN. Of course, this also improves the economics of running an all-flash VSAN, and opens up all-flash VSAN to more use cases.

A brief overview of compression and deduplication on VSAN

Most readers are probably familiar with the concept of deduplication. It has been widely used in the storage industry for some time now. In a nutshell, deduplication checks to see whether a block of data is already persisted on storage. If it is, rather than storing the same block twice, a small reference is created to the already existing block. If the same block of data occurs many times, significant space savings are achieved.

In all-flash VSAN, which is the only configuration where deduplication and compression are supported, data blocks are kept in the cache tier while they are active (hot) for optimal performance. As soon as the data is no longer active (cold), it is destaged to the capacity tier. It is during this destaging process that VSAN does the deduplication (and compression) processing.

Deduplication on VSAN uses the SHA-1 hashing algorithm to create a “fingerprint” for every data block. SHA-1 makes it extremely unlikely that two different blocks of data produce the same hash, so each fingerprint can be treated as uniquely identifying its block. When a new block arrives, it is hashed and the hash is compared against the existing table of hashes. If the hash already exists, there is no need to store the new block; VSAN simply adds a new reference to the existing one. If it does not already exist, a new hash entry is created and the block is persisted.
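To make the hash-table lookup concrete, here is a minimal Python sketch of the idea. It is not VSAN code: the `DedupStore` class, its in-memory dictionaries, and the 4KB block size used for the sample data are all assumptions made for illustration. Only the use of SHA-1 fingerprints and the "store once, then add references" behaviour come from the description above.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this illustration


class DedupStore:
    """Toy content-addressed store: one physical copy per unique block."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> block data (persisted once)
        self.refcount = {}  # fingerprint -> number of logical references

    def write(self, block: bytes) -> str:
        # Fingerprint the incoming block with SHA-1.
        fingerprint = hashlib.sha1(block).hexdigest()
        if fingerprint in self.blocks:
            # Block already persisted: only record another reference.
            self.refcount[fingerprint] += 1
        else:
            # New block: create the hash entry and persist the data.
            self.blocks[fingerprint] = block
            self.refcount[fingerprint] = 1
        return fingerprint


store = DedupStore()
logical = [b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE, b"A" * BLOCK_SIZE, b"A" * BLOCK_SIZE]
for blk in logical:
    store.write(blk)

logical_blocks = len(logical)
physical_blocks = len(store.blocks)
print(logical_blocks, physical_blocks)  # 4 logical writes, 2 blocks actually stored
```

The more often the same block recurs, the larger the gap between the logical and physical counts, which is where the space saving comes from.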

Another new space-saving technique in VSAN 6.2 is compression. VSAN uses the LZ4 compression algorithm, and it works on 4KB blocks. If a new block is found to be unique, it also goes through compression. If LZ4 manages to reduce the block to 2KB or less, the compressed version of the block is persisted to the capacity tier. If compression cannot get the block down to 2KB or less, the full-sized block is persisted. We do it in this order (deduplication followed by compression) because if the block already exists, we don’t have to pay the compression penalty for that block.
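The 2KB rule can be sketched in a few lines of Python. Note the assumptions: Python's standard library has no LZ4 binding, so `zlib` is used here purely as a stand-in compressor, and the `persist_form` function is a hypothetical name for illustration. Only the 4KB block size and the 2KB threshold come from the text.

```python
import os
import zlib

BLOCK_SIZE = 4096  # compression works on 4KB blocks
THRESHOLD = 2048   # keep the compressed form only if it is 2KB or less


def persist_form(block: bytes):
    """Return ("compressed", data) if the block shrinks to <= 2KB, else ("raw", block)."""
    compressed = zlib.compress(block)  # stand-in for LZ4 (assumption)
    if len(compressed) <= THRESHOLD:
        return "compressed", compressed
    # Not enough saving: persist the full-sized block instead.
    return "raw", block


repetitive = b"vsan" * (BLOCK_SIZE // 4)  # highly compressible data
random_like = os.urandom(BLOCK_SIZE)      # effectively incompressible data

kind_a, data_a = persist_form(repetitive)
kind_b, data_b = persist_form(random_like)
print(kind_a, len(data_a))
print(kind_b, len(data_b))
```

In a combined pipeline this function would only be called for blocks that the deduplication lookup found to be unique, which is how the compression cost is avoided for duplicates.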

Is this inline or post process?

Many readers who follow online community debates will be well aware of past heated discussions about whether a deduplication process is done inline or post process. I’m sure we will see something similar around VSAN. For me, this method is neither inline nor post process. Since the block is not deduplicated or compressed while it is in cache, but only when it is moved to the capacity tier, we’ve decided to use the term “near-line” to describe this approach.

There is one major advantage to this approach. As your applications write data, the same block may be overwritten many times. On all-flash VSAN, those overwrites land in the high-performance, high-endurance cache tier/write buffer. Once the block is cold (no longer in use), it is moved to the capacity tier, and only at this point does it go through deduplication and compression. This is a significant saving on overhead, as cycles are not wasted deduplicating and compressing a block that is overwritten immediately afterwards, or multiple times afterwards.
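The saving can be illustrated with a small Python sketch. The `WriteBuffer` class, the logical block addresses, and the counters below are hypothetical and exist only to show the principle: overwrites are absorbed in the buffer, and each surviving block is fingerprinted once at destage time rather than once per write.

```python
import hashlib


class WriteBuffer:
    """Toy cache tier: keeps only the latest version of each block."""

    def __init__(self):
        self.blocks = {}      # logical block address -> latest data
        self.writes_seen = 0  # how many writes an inline design would have processed

    def write(self, lba: int, data: bytes) -> None:
        # An overwrite simply replaces the cached version; nothing is hashed yet.
        self.blocks[lba] = data
        self.writes_seen += 1

    def destage(self) -> dict:
        """Fingerprint each surviving block once as it moves to the capacity tier."""
        fingerprints = {
            lba: hashlib.sha1(data).hexdigest() for lba, data in self.blocks.items()
        }
        self.blocks.clear()
        return fingerprints


buf = WriteBuffer()
for version in range(5):
    buf.write(7, f"hot block v{version}".encode())  # same block overwritten 5 times
buf.write(8, b"written once")

destaged = buf.destage()
print(buf.writes_seen, len(destaged))  # 6 writes absorbed, 2 blocks processed at destage
```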

Enabling deduplication and compression

This is very straightforward, as one would expect from VSAN. Simply navigate to the VSAN management view, where you can edit the setting for these space-saving techniques and change it from disabled to enabled, as shown below:

Allow Reduced Redundancy

In the above screenshot, you might ask what that “Allow Reduced Redundancy” check-box is about. Well, enabling deduplication and compression requires an on-disk format change. If you have been through this process before, either upgrading from VMFS-L (v1) to VirstoFS (v2), or even upgrading to VSAN 6.2 (v3), you will be aware of the methodology used. VSAN evacuates data from a disk group, removes it and then rebuilds it with the new on-disk format, one disk group at a time, and then rinses and repeats for all disk groups. The same applies to deduplication and compression. VSAN will evacuate the disk group, remove it and recreate it with a new on-disk format that supports these new features.

Now, just like previous on-disk format upgrades, you may not have enough resources in the cluster to allow the disk group to be fully evacuated. Maybe this is a three-node cluster, and there is nowhere to evacuate the replica or witness while maintaining full protection. Or it could be a 4-node cluster with RAID-5 objects already deployed. In this case, there is no place to move part of the RAID-5 stripe (since RAID-5 objects require 4 nodes). It might also simply mean that you have consumed a significant amount of disk capacity, with no room to store all the data on fewer disk groups. In any case, you still have the option of enabling deduplication and compression, but with the understanding that you will be running with some risk during the process, since there is no place to move the components. This option will allow the VMs to stay running, but they may not be able to tolerate the full complement of failures defined in the policy during the on-disk format change for dedupe/compression. With this option, VSAN removes components from the objects, rebuilds the disk group with the new on-disk format, and rebuilds the components before moving on to the next step.
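For the capacity case only, a rough back-of-the-envelope check can be sketched in Python. This is a simplification under stated assumptions: the `can_fully_evacuate` function, the disk group names, and the capacity figures are all made up for illustration, and the check deliberately ignores placement rules such as the node-count requirements described above. It only asks whether the data on one disk group would fit in the free space of the others.

```python
def can_fully_evacuate(disk_groups: dict, target: str) -> bool:
    """Capacity-only check: does the target's data fit in the other groups' free space?"""
    used_on_target = disk_groups[target]["used_gb"]
    free_elsewhere = sum(
        group["capacity_gb"] - group["used_gb"]
        for name, group in disk_groups.items()
        if name != target
    )
    return used_on_target <= free_elsewhere


# Hypothetical cluster that is lightly used: evacuation fits.
roomy = {
    "dg1": {"capacity_gb": 1000, "used_gb": 300},
    "dg2": {"capacity_gb": 1000, "used_gb": 300},
    "dg3": {"capacity_gb": 1000, "used_gb": 300},
}

# Hypothetical cluster that is nearly full: evacuation does not fit.
nearly_full = {
    "dg1": {"capacity_gb": 1000, "used_gb": 800},
    "dg2": {"capacity_gb": 1000, "used_gb": 800},
    "dg3": {"capacity_gb": 1000, "used_gb": 800},
}

roomy_ok = can_fully_evacuate(roomy, "dg1")
full_ok = can_fully_evacuate(nearly_full, "dg1")
print(roomy_ok, full_ok)
```

When a check like this fails, a full evacuation is not possible on capacity grounds alone, which is the kind of situation the reduced redundancy option is there for.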

Monitoring deduplication and compression space-saving

VSAN 6.2 has introduced a new section to the UI called Capacity views. From here, not only can administrators see the overheads of filesystem on-disk formats and features such as deduplication and compression, but they can also see where capacity is being consumed on a per-object-type basis. I’ll write more about these new capacity views in another post. However, if you want to see the space savings that you are achieving with deduplication and compression, this is the place to look. The new UI will also display how much space is required to inflate any deduplicated and compressed objects, should you decide to disable the space-saving features at some point in the future.

Deduplication and compression with Object Space Reservation

There is a major consideration if you already use object space reservation (OSR) and now wish to use deduplication and compression. In essence, objects need to have either 0% OSR (thin) or 100% OSR (fully reserved, i.e. thick). You cannot have any value in between. If you already have VMs deployed with an OSR value that is somewhere in between, you will not be able to enable deduplication and compression. The attempt fails with an error, as shown here:

You need to remove that policy from the VMs if you wish to use deduplication and compression. Alternatively, set the OSR to a value of 0% or 100%.
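If you want to reason about which policies would block the change, the rule is easy to express. The following Python sketch is illustrative only: the `osr_blockers` function and the policy names and values are hypothetical, and it does not query VSAN in any way. It simply applies the "0% or 100% only" rule to a dictionary of policy names and OSR percentages.

```python
def osr_blockers(policies: dict) -> list:
    """Return the names of policies whose OSR is neither 0% nor 100%."""
    return sorted(name for name, osr in policies.items() if osr not in (0, 100))


# Hypothetical policy names and OSR percentages.
policies = {
    "thin-default": 0,
    "thick-db": 100,
    "half-reserved": 50,
    "quarter-reserved": 25,
}

blockers = osr_blockers(policies)
print(blockers)  # policies that would need to change to 0% or 100% first
```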

Other considerations

The first thing to note is that deduplication and compression are only available on all-flash VSAN configurations.

The second thing to note is that this feature will make you reconsider the way you scale up your disk groups. In the past, administrators could add and remove disks to disk groups as the need arose. While you can still add individual disks to a disk group as you scale up, the advice, if you plan to use deduplication and compression, is to build fully populated disk groups in advance and then enable the space efficiency techniques on the cluster. Because the hash tables are spread out across all the capacity disks in a disk group when deduplication and compression are enabled, it is not possible to remove disks from a disk group after the space-saving features are enabled.

It should also be taken into consideration that a failure of any of the disks in the disk group will impact the whole of the disk group. You will have to wait for the data in the disk group to be rebuilt elsewhere in the cluster, address the failing component and recreate the disk group, followed by a re-balance of the cluster. These are some of the considerations you will need to take into account from an administrator’s perspective when leveraging the space-saving capabilities of deduplication and compression on VSAN.