I was involved in some conversations recently about how the VAAI UNMAP command behaves and which characteristics affect its performance. For those of you who do not know, UNMAP is our mechanism for reclaiming dead or stranded space from thinly provisioned VMFS volumes. Prior to this capability, the ESXi host had no way of informing the storage array that space previously consumed by a particular VM or file was no longer in use. This meant that the array thought that more space was being consumed than was actually the case. UNMAP, part of the vSphere APIs for Array Integration, enables administrators to overcome this challenge by telling the array that these blocks on a thin provisioned volume are no longer in use and that they can be reclaimed.
I had the pleasure (?) recently of troubleshooting some backup issues on my vSphere Data Protection Advanced (VDPA) setup. To be honest, I had not spent a great deal of time on this product recently, other than a few simple backups and restores. However, in my new role I now have a number of other projects which require me to understand this product’s functionality a bit more. When things were not going right for me, though, I spent a lot of time searching for log files which might give me some clue as to the nature of my problem. After some assistance from some of the GSS guys based in Cork, we narrowed it down.
For me, the install and configure part went fine. I could also create backups with relative ease. The issue was that the backups were failing when run in certain environments. So how then could I determine why this was happening?
There are many occasions where the vSphere client does not display all the relevant information about a particular storage device, or does not show enough to troubleshoot problems related to that device. The purpose of this post is to explain some of the ESXCLI commands that I use most often when trying to determine storage device information, and to troubleshoot a particular device.
This is a topic which has been discussed time and time again. It relates to an advanced storage parameter called Disk.SchedNumReqOutstanding, or DSNRO for short. There are a number of postings out there on the topic, so I will not get into the details once again. If you wish to learn more about what this parameter does for you, I recommend reading this post on DSNRO from my good pal Duncan Epping. Suffice to say that this parameter is related to virtual machine I/O fairness. In this post, I’ll talk about changes to DSNRO in vSphere 5.5.
There have been some notable discussions about VMFS heap size and heap consumption over the past year or so. An issue with the VMFS heap in previous versions meant that there were concerns when accessing more than 30TB of open files from a single ESXi host. VMware released a number of patches to temporarily work around the issue. ESXi 5.0p5 & 5.1U1 introduced a larger heap size to deal with this. However, I’m glad to say that a permanent solution has been included in vSphere 5.5 in the form of a dedicated slab for VMFS pointers and a new eviction process. I will discuss the details of this fix here.
I have had a few occasions recently to start using vscsiStats. For those of you who may be unfamiliar, this is a great tool for virtual machine disk I/O workload characterization. Have you ever wondered about the most common I/O size generated by the Guest OS? What about the latency of those I/Os? What about checking to see the I/O generated by a Guest OS when it is in a so-called ‘idle’ state? vscsiStats can help with all of these queries, as well as providing some excellent troubleshooting options. The tool has been around since the ESX 3.5 days. This post will take you through some of the steps in getting started with vscsiStats.
A number of you have reached out about how to change some of the settings around path policies, in particular how to set the default IOPS setting in the Round Robin path selection policy (PSP) to 1. While many of you have written scripts to do this, when you reboot the ESXi host, the defaults of the PSP are re-applied and you then have to run the scripts again to reapply the changes. Here I will show you how to modify the defaults so that when you unclaim/reclaim the devices, or indeed reboot the host, the desired settings come into effect immediately.
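
As a rough illustration of what a persistent default might look like, here is a minimal Python sketch that only builds (and does not run) an esxcli command adding a SATP claim rule with Round Robin and iops=1 as the PSP option. This is one common approach and may not be the exact method the full post describes. The helper name, the SATP name and the vendor and model strings are placeholder assumptions; substitute the values that apply to your array.

```python
import shlex


def satp_rule_add_command(satp, vendor, model, iops=1):
    """Build an esxcli command string that adds a SATP claim rule.

    Devices matching the vendor/model strings would then default to the
    Round Robin PSP with the given iops option when they are claimed.
    All argument values passed in are illustrative placeholders.
    """
    if not isinstance(iops, int) or iops < 1:
        raise ValueError("iops must be a positive integer")
    args = [
        "esxcli", "storage", "nmp", "satp", "rule", "add",
        "--satp", satp,
        "--vendor", vendor,
        "--model", model,
        "--psp", "VMW_PSP_RR",
        "--psp-option", f"iops={iops}",
    ]
    # Quote each argument so model strings containing spaces stay intact.
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical array identifiers, used purely for illustration.
cmd = satp_rule_add_command("VMW_SATP_ALUA", "ExampleVendor", "ExampleModel")
print(cmd)
```

The printed string can be reviewed and then run in an ESXi shell. Because the rule is stored with the host's claim rules rather than applied per device, it should take effect the next time matching devices are claimed.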