I was involved in some conversations recently on how the VAAI UNMAP command behaved, and which characteristics affected its performance. For those of you who do not know, UNMAP is our mechanism for reclaiming dead or stranded space from thinly provisioned VMFS volumes. Prior to this capability, the ESXi host had no way of informing the storage array that space previously consumed by a particular VM or file was no longer in use. This meant that the array thought more space was being consumed than was actually the case. UNMAP, part of the vSphere APIs for Array Integration (VAAI), enables administrators to overcome this challenge by telling the array that these blocks on a thin provisioned volume are no longer in use and can be reclaimed.
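On ESXi 5.0 U1 and 5.1 hosts, reclamation is a manual operation driven by vmkfstools. Here is a minimal sketch of triggering a reclaim, assuming a thin provisioned datastore named datastore1 (the name is a placeholder):

# Change to the root of the VMFS volume to be reclaimed
cd /vmfs/volumes/datastore1
# Reclaim up to 60% of the free space on the volume; vmkfstools temporarily
# inflates a balloon file and issues UNMAPs for the blocks beneath it
vmkfstools -y 60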
There are many occasions where the vSphere client does not display all of the relevant information about a particular storage device, or enough to troubleshoot problems related to that device. The purpose of this post is to explain the ESXCLI commands I most often use when trying to determine storage device information, and to troubleshoot a particular device.
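As a taster, the following commands display device and multipathing details; the NAA identifier below is a placeholder for one of your own devices:

# List every storage device visible to this host
esxcli storage core device list
# Display the details of a single device
esxcli storage core device list -d naa.600601601234567890
# Display the multipathing (NMP) configuration for the same device
esxcli storage nmp device list -d naa.600601601234567890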
Continuing on the series of vSphere 5.5 Storage Enhancements, we now come to a feature that is close to many people’s hearts. The vSphere Storage APIs for Array Integration (VAAI) UNMAP primitive reclaims dead or stranded space on a thinly provisioned VMFS volume, something we could not do before this primitive came into existence. However, it has a long and somewhat checkered history. Let me share the timeline with you before I get into what improvements we made in vSphere 5.5.
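To show where the story ends up, in vSphere 5.5 dead space reclamation is driven through esxcli. A minimal sketch, assuming a datastore named datastore1:

# Reclaim dead space from a thin provisioned VMFS volume;
# --reclaim-unit is the number of VMFS blocks unmapped per iteration
esxcli storage vmfs unmap --volume-label=datastore1 --reclaim-unit=200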
There have been some notable discussions about VMFS heap size and heap consumption over the past year or so. An issue with previous versions of the VMFS heap meant that there were concerns when addressing more than 30TB of open files from a single ESXi host. VMware released a number of patches to temporarily work around the issue; ESXi 5.0p5 & 5.1U1 introduced a larger heap size to deal with it. However, I’m glad to say that a permanent solution has been included in vSphere 5.5 in the form of a dedicated slab for VMFS pointers and a new eviction process. I will discuss the details of this fix here.
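On pre-5.5 hosts, the interim workaround was to raise the VMFS heap size via an advanced setting. A sketch of checking and changing it (640MB reflects the maximum allowed by the patched builds, and a host reboot is required for the change to take effect):

# Show the current maximum VMFS heap size in MB
esxcli system settings advanced list -o /VMFS3/MaxHeapSizeMB
# Raise the maximum heap size to 640MB
esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 640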
In my recent post about the new large 64TB VMDKs available in vSphere 5.5, I mentioned that one could not hot-extend a VMDK (i.e. grow the VMDK while the VM is powered on) to the new larger size because some Guest OS partition formats cannot handle this change on-the-fly. The question was whether hot-extend was possible if the VMDK was already 2TB or more in size. I didn’t know the answer, so I decided to try a few tests in my environment. Continue reading
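As a point of reference, growing a VMDK from the ESXi command line looks like the sketch below (path and size are hypothetical); note that vmkfstools requires the VM to be powered off, so hot-extend itself goes through the vSphere Web Client or the API:

# Grow an existing VMDK to 3TB; the VM owning the disk should be powered off
vmkfstools -X 3072G /vmfs/volumes/datastore1/bigvm/bigvm.vmdk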
Our GSS folks just released KB article 2049103, which details a VMFS heartbeat and lock corruption issue that manifests itself on Dell EqualLogic storage arrays running PS Series firmware v6.0.6. As per the KB:
A VMFS datastore has a region designated for heartbeat types of operations to ensure that distributed access to the volume occurs safely. When files are being updated, the heartbeat region for those files is locked by the host until the update is complete.
In this scenario, the heartbeat region has become corrupt.
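If you suspect this kind of metadata corruption, VOMA (the vSphere On-disk Metadata Analyzer, included with vSphere 5.1 and later) can check the volume. A sketch, assuming the datastore has first been unmounted from all hosts and using a placeholder device identifier:

# Run a read-only VMFS metadata check against partition 1 of the device
voma -m vmfs -f check -d /vmfs/devices/disks/naa.600601601234567890:1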
This is something that has come up numerous times, and behavior that many of you have observed. There seems to be some issue with uploading files to a VMFS datastore. In fact, in one example, someone reported that it took 10 minutes to upload a Windows 7 ISO to an iSCSI datastore and less than 1 minute to upload the same ISO to an NFS datastore. Both datastores were healthy and fast, and both had running VMs on them. Variations of this behavior have been reported before. This post will try to explain why.