There have been some notable discussions about VMFS heap size and heap consumption over the past year or so. An issue with previous versions of the VMFS heap meant that there were concerns when accessing more than 30TB of open files from a single ESXi host. VMware released a number of patches to temporarily work around the issue. ESXi 5.0p5 & 5.1U1 introduced a larger heap size to deal with this. However, I’m glad to say that a permanent solution has been included in vSphere 5.5 in the form of a dedicated slab for VMFS pointers and a new eviction process. I will discuss the details of this fix here.
I have had a few occasions recently to start using vscsiStats. For those of you who may be unfamiliar, this is a great tool for virtual machine disk I/O workload characterization. Have you ever wondered about the most common I/O size generated by the Guest OS? What about the latency of those I/Os? What about checking to see the I/O generated by a Guest OS when it is in a so-called ‘idle’ state? vscsiStats can help with all of these queries, as well as providing some excellent troubleshooting options. The tool has been around since the ESX 3.5 days. This blog will take you through some of the steps in getting started with vscsiStats.
A number of you have reached out about how to change some of the settings around path policies, in particular how to set the default number of iops in the Round Robin path selection policy (PSP) to 1. While many of you have written scripts to do this, when you reboot the ESXi host, the defaults of the PSP are re-applied and you then have to run the scripts again to reapply the changes. Here I will show you how to modify the defaults so that when you unclaim/reclaim the devices, or indeed reboot the host, the desired settings come into effect immediately.
You may remember an enhancement which we made to Storage I/O Control (SIOC) in the vSphere 5.1 release whereby SIOC can now automatically determine the characteristics, and thus the latency threshold, of a datastore. Prior to this change, SIOC either used a default value or relied on customers to set the threshold manually. Neither of these was ideal, so we introduced this automatic method. However, there was little detail on how often this latency threshold was calculated. In other words, does the calculation take place only when SIOC is first enabled, or are there regular ongoing calculations?
This is something that was recently brought to my attention, and I wasn’t aware of this difference in behavior between the various storage vendors who implement VAAI-NAS. VAAI-NAS implements a number of different offload primitives, but the one we are interested in here is the Fast File Clone primitive, which is the ability to offload the creation of snapshots/clones to the NAS storage array. This mechanism is also referred to as Native Snapshots. However, some arrays cannot support a full chain of snapshots.