vSphere 5.5 Storage Enhancements Part 4: UNMAP
Continuing on the series of vSphere 5.5 Storage Enhancements, we now come to a feature that is close to many people’s hearts. The vSphere Storage API for Array Integration (VAAI) UNMAP primitive reclaims dead or stranded space on a thinly provisioned VMFS volume, something that we could not do before this primitive came into existence. However, it has a long and somewhat checkered history. Let me share the timeline with you before I get into what improvements we made in vSphere 5.5.
History of UNMAP
vSphere 5.0 introduced the VAAI UNMAP primitive in automatic mode, which meant that once a virtual machine was migrated or deleted from a thinly provisioned VMFS datastore, the dead space was automatically reclaimed by the array.
We very quickly had reports from customers that this wasn’t working as expected. I blogged about it here about 2 years ago. We basically asked customers to turn the feature off. We turned it off ourselves in vSphere 5.0 patch 2.
We then brought support for the VAAI UNMAP primitive back in vSphere 5.0U1. However, the automated aspect was disabled and the reclaim now had to be run manually using a vmkfstools command. I did a blog post on how you would go through the process of a manual reclaim. To be honest, we did not do a great job of providing details on how to do the reclaim, and we left customers to figure out things like how much free space to use, etc.
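For reference, the manual reclaim looked something like this, run from the datastore's mount point, where the value passed to -y is the percentage of free space to use for the reclaim (the datastore name and the 60% figure below are just example values):

cd /vmfs/volumes/DatastoreName
vmkfstools -y 60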
UNMAP in vSphere 5.5
This leads us nicely to what enhancements we made around this feature in vSphere 5.5. Basically we have introduced a much simpler command in the ESXCLI namespace:
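esxcli storage vmfs unmap -l DatastoreName

(DatastoreName above is just a placeholder; you can also identify the volume by its UUID with the -u option instead of the -l label.)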
There are a few other nice features around the new command. The reclaim size can now be specified in blocks rather than as a percentage value (which is what was required for vmkfstools -y), making the reclaim process much more intuitive. Also, dead space can now be reclaimed in increments rather than all at once.
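For example, something along these lines (again, the datastore name is a placeholder) works through the dead space in chunks of 200 VMFS blocks at a time rather than issuing one huge reclaim:

esxcli storage vmfs unmap -l DatastoreName -n 200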
What about automation?
We do understand the demand for having dead space reclaimed automatically. We continue to look at ways to do this without impacting a production environment by overloading the storage system with UNMAP requests. I’ll share more details when I can.
Hi Cormac,
Thanks for the information.
The thing is that with SDRS turned on, you might have quite a few migrations over time, which then leads to LUNs being reported as full on the storage side.
With VASA activated, these datastores will not be used as migration targets, even if the datastore is empty.
So with automated migrations it is really crucial to have automated UNMAP if you want to avoid hiring a person doing nothing but manual unmaps 😉
The only workaround today if you want to keep VASA is to disable storage alarms completely.
Regards,
Tom
Not sure I get the VASA reference Tom – I assume you mean VAAI. And yes, if the datastores are thin provisioned in the SDRS cluster, you are correct – you will leave stranded space that will require UNMAP to free/reclaim it.
We know that the current manual reclaim mechanism isn’t ideal – but think about what you had to do before. We’re continuing to investigate ways to make this less of a management headache – all I can say is ‘watch this space’ for now.
Hi,
The problem comes through VASA. As the storage reports 100% used capacity for the LUN, VASA reports this to vCenter Server. Anything above the threshold of 75% will result in an alert on the datastore (thin provisioning capacity threshold alert for naa.XXX), even with the VMFS datastore being empty.
I had the problem, and needed to disable storage alarms in the VASA provider as a workaround.
Regards,
Tom
Ah yes – now I get it. Good point.
I’ve seen a lot of arrays (not going to name names) slam CPU to 100% or cause all kinds of latency issues during UNMAP. VMware can do all they can on their side, but until the array vendors fix their problems this is going to be a problem with Thin Provisioning and SDRS. In the meantime I’m relying on Zero Page Reclaim (most arrays can grovel at their native page size, which is easier than at the VMware block size).
Is there a difference between cloning a VM offline and online with VAAI?
http://vbry21.wordpress.com/tag/vaai/
This plug-in enables NAS arrays to integrate with vSphere to transparently offload certain storage operations to the array, such as offline cloning (cold migrations, cloning from templates).
In practice I did the following on NFS storage with VAAI enabled: cloning a powered-on VM takes a long time, while cloning a powered-off VM is very fast.
Storage VMotion of running VMs using VAAI-NAS will not offload to the array. Cold migration of powered off VMs will offload. This is different to VAAI-Block implementations where both operations are offloaded. It’s a significant difference between the implementations, and explains why one is so slow in your environment and the other is fast.