I/O Scheduler Queues Improvement for Virtual Machines

This is a new feature in vSphere 6.0 that I only recently became aware of. Prior to vSphere 6.0, all the I/Os from a given virtual machine to a particular device shared a single I/O queue. This meant that all the I/Os from the VM (boot VMDK, data VMDK, snapshot delta) were queued into a single per-VM, per-device queue. This caused I/Os from different VMDKs to interfere with each other and could actually hurt fairness. For example, if a VMDK was used by a database, and this database issued a lot of I/O, this could compete with I/Os from the…
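
To make the fairness problem concrete, here is a toy Python sketch contrasting the two queueing models. This is purely my own illustration of the idea, not VMware's scheduler code; the VMDK names and I/O counts are invented.

```python
from collections import deque

# Three I/O streams from one VM; data.vmdk is the noisy database disk.
streams = {
    "boot.vmdk":  ["b1", "b2"],
    "data.vmdk":  [f"d{i}" for i in range(8)],
    "delta.vmdk": ["s1"],
}

def shared_queue(streams):
    """Pre-6.0 model: every VMDK's I/Os land in one per-VM, per-device
    queue, so a burst from one VMDK delays everything queued behind it."""
    q = deque()
    for name, ios in streams.items():
        for io in ios:
            q.append((name, io))
    return list(q)

def per_file_queues(streams):
    """6.0 model (conceptually): one queue per VMDK, serviced
    round-robin, so each disk gets a fair share of the device queue."""
    queues = {name: deque(ios) for name, ios in streams.items()}
    order = []
    while any(queues.values()):
        for name, q in queues.items():
            if q:
                order.append((name, q.popleft()))
    return order

print(shared_queue(streams))     # the delta I/O waits behind the whole database burst
print(per_file_queues(streams))  # boot/data/delta I/Os interleave fairly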

VSAN Design & Sizing – Memory overhead considerations

This week I was in Berlin for our annual Tech Summit in EMEA. This is an event for our field folks in EMEA. I presented a number of VSAN sessions, including a design and sizing session. As part of that session, the topic of VSAN memory consumption was raised. In the past, we’ve only ever really talked about the host memory requirements for disk group configuration, as highlighted in this post here. For example, as per that post, to run a fully configured Virtual SAN system, with 5 fully populated disk groups per host and 7 disks in each…
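
As a back-of-the-envelope aid, the general shape of the host memory formula can be sketched in Python. The constants below are illustrative placeholders only; the authoritative per-version values are in the post referenced above and the relevant VMware KB.

```python
# Sketch of the VSAN host memory overhead formula; the constants are
# ASSUMED placeholder values for illustration, not official figures.
BASE_CONSUMPTION_MB = 5426       # per-host baseline (assumed)
DISKGROUP_BASE_MB = 636          # fixed cost per disk group (assumed)
SSD_OVERHEAD_MB_PER_GB = 8       # cache-tier overhead per GB (assumed)
CAPACITY_DISK_MB = 70            # fixed cost per capacity disk (assumed)

def vsan_host_memory_mb(num_disk_groups, disks_per_group, ssd_size_gb):
    """Memory consumed on one host for a given disk group layout."""
    per_group = (DISKGROUP_BASE_MB
                 + SSD_OVERHEAD_MB_PER_GB * ssd_size_gb
                 + disks_per_group * CAPACITY_DISK_MB)
    return BASE_CONSUMPTION_MB + num_disk_groups * per_group

# The fully configured example from the post: 5 disk groups, 7 disks each.
print(vsan_host_memory_mb(num_disk_groups=5, disks_per_group=7,
                          ssd_size_gb=400))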

SMP-FT support on Virtual SAN

There have been a number of questions recently about SMP-FT on Virtual SAN. Symmetric Multi-Processing Fault Tolerance (SMP-FT) is a feature that many VMware customers have been waiting for. With the release of vSphere 6.0, the SMP-FT capability finally became available. However, this release did not include SMP-FT support when the VM ran on VSAN. With the release of vSphere 6.0U1, which included VSAN 6.1, there is now support for SMP-FT when the VM runs on VSAN. There are some caveats when it comes to the different VSAN deployment methodologies: On standard VSAN deployments, SMP-FT is supported…

VSAN resync behaviour when failed component recovers

I have been asked this question a number of times now. Those of you familiar with VSAN will know that if a component goes absent for a period of 60 minutes (the default), then VSAN will begin rebuilding a new copy of the component elsewhere in the cluster (if resources allow it). The question then is: if the missing/absent/failed component recovers and becomes visible to VSAN once again, what happens? Will we throw away the component that was just created, or will we throw away the original component that recovered?
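
Intuitively, the decision comes down to which copy is cheaper to finish: catching up the recovered original, or completing the partially built new replica. The sketch below is my own illustration of that trade-off, not VSAN's actual component-selection logic, and the byte counts are invented.

```python
# Conceptual sketch (my illustration, not VSAN internals): keeping
# whichever copy needs the least data resynced minimises resync traffic.
def component_to_keep(original_stale_bytes, rebuild_remaining_bytes):
    """original_stale_bytes: data written while the component was absent;
    rebuild_remaining_bytes: data the new replica still has to copy."""
    if original_stale_bytes <= rebuild_remaining_bytes:
        return "recovered original"   # cheap to catch up; discard new copy
    return "new replica"              # rebuild nearly done; discard original

# A component absent for an hour is usually only a little stale,
# while the replacement replica may still have most of its data to copy.
print(component_to_keep(original_stale_bytes=2 * 1024**3,
                        rebuild_remaining_bytes=50 * 1024**3))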

VSAN Proactive Rebalance not starting

Some time back I wrote about proactive rebalancing, a new feature of VSAN 6.0. However, I have had a number of queries recently about its functionality. The most common query is that when the proactive rebalance operation is started, there doesn’t appear to be any rebuild/resync activity, even though the command output lists a number of disks that need to be rebalanced (rebalancing moves components between physical disks so that each disk is equally consumed).
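
One way to picture the gap between "listed as needing rebalance" and "actively moving data" is as two separate checks: a fullness-variance test that flags a disk, and a persistence test that must also pass before any components move. The sketch below is my own model of that behaviour; both threshold values are assumptions, not confirmed defaults.

```python
# Conceptual sketch (mine, not VSAN source) of a two-step rebalance check.
VARIANCE_THRESHOLD = 0.30   # assumed: spread vs. least-full disk that flags a disk
TIME_THRESHOLD_S = 1800     # assumed: imbalance must persist before data moves

def should_move_components(disk_fullness, least_full, imbalance_age_s):
    """A disk can be flagged as needing rebalance long before the
    persistence check allows any component migration to start."""
    over_variance = (disk_fullness - least_full) > VARIANCE_THRESHOLD
    return over_variance and imbalance_age_s >= TIME_THRESHOLD_S

# Flagged in the command output, but no resync traffic yet:
print(should_move_components(disk_fullness=0.85, least_full=0.40,
                             imbalance_age_s=120))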

DRS and VM/Host Affinity Groups in VSAN Stretched Cluster

In a previous post, I talked about how vSphere HA is used extensively in VSAN Stretched Cluster. The primary purpose of vSphere HA is to restart virtual machines in the event of a failure. However, to ensure that restarted virtual machines continue to perform optimally and keep using a warmed cache, I mentioned that we need to use VM/Host affinity rules. In this post I want to discuss the role of DRS and VM/Host affinity rules in more detail, and how they are used in a VSAN stretched cluster.
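
For anyone who prefers to script the groups and rules rather than click through the Web Client, the pyVmomi sketch below shows how a host group, a VM group, and a "should run on" rule might be created. The cluster name, group names, membership, and credentials are all illustrative, and SSL/error handling is omitted for brevity.

```python
from pyVim.connect import SmartConnect
from pyVmomi import vim

# Illustrative connection details.
si = SmartConnect(host="vcsa.example.com",
                  user="administrator@vsphere.local", pwd="...")
content = si.RetrieveContent()

# Find the stretched cluster (assumed name "VSAN-Stretched").
cluster = None
for dc in content.rootFolder.childEntity:
    for entity in dc.hostFolder.childEntity:
        if isinstance(entity, vim.ClusterComputeResource) \
                and entity.name == "VSAN-Stretched":
            cluster = entity

# Host group for the preferred site, and a VM group for the VMs that
# should normally run there (membership chosen for illustration only).
host_group = vim.cluster.GroupSpec(operation="add",
    info=vim.cluster.HostGroup(name="preferred-site-hosts",
                               host=list(cluster.host[:2])))
vm_group = vim.cluster.GroupSpec(operation="add",
    info=vim.cluster.VmGroup(name="preferred-site-vms",
                             vm=[cluster.resourcePool.vm[0]]))

# A "should" rule (mandatory=False) keeps the VMs on the preferred site
# during normal operations, without blocking an HA restart elsewhere.
rule = vim.cluster.RuleSpec(operation="add",
    info=vim.cluster.VmHostRuleInfo(name="preferred-site-affinity",
                                    enabled=True, mandatory=False,
                                    vmGroupName="preferred-site-vms",
                                    affineHostGroupName="preferred-site-hosts"))

spec = vim.cluster.ConfigSpecEx(groupSpec=[host_group, vm_group],
                                rulesSpec=[rule])
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```

The key design choice is mandatory=False: a "should" rule preserves cache locality in normal operations, but does not prevent vSphere HA from restarting the VMs on the surviving site after a site failure.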