I/O Scheduler Queues Improvement for Virtual Machines

This is a new feature in vSphere 6.0 that I only recently became aware of. Prior to vSphere 6.0, all the I/Os from a given virtual machine to a particular device would share a single I/O queue. This would result in all the I/Os from the VM (boot VMDK, data VMDK, snapshot delta) being queued into a single per-VM, per-device queue. This caused I/Os from different VMDKs to interfere with each other and could actually hurt fairness.

For example, if a VMDK was used by a database, and this database issued a lot of I/O, those I/Os could compete with I/Os from the boot disk. This in turn could make it appear that the VM (Guest OS) was running slowly.
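To make the change concrete, here is a tiny Python sketch (purely conceptual, not ESXi code, and the 9:1 I/O mix is made up) contrasting a single shared queue with per-VMDK queues drained round-robin:

```python
# Conceptual model only: a noisy data VMDK enqueues 9 I/Os for every 1
# boot-VMDK I/O. With one shared FIFO, the boot I/O waits behind the
# entire backlog; with one queue per VMDK drained round-robin, each
# disk gets a fair turn.
from collections import deque

data_ios = [("data", i) for i in range(9)]
boot_ios = [("boot", 0)]

# Pre-6.0 model: everything funnels into a single per-VM, per-device queue.
shared = deque(data_ios + boot_ios)
print("shared queue order:", [io[0] for io in shared])
# -> the boot I/O is dispatched last, behind all nine data I/Os

# 6.0 model: one queue per VMDK, serviced round-robin.
queues = {"data": deque(data_ios), "boot": deque(boot_ios)}
order = []
while any(queues.values()):
    for q in queues.values():
        if q:
            order.append(q.popleft()[0])
print("round-robin order:", order)
# -> the boot I/O is dispatched in the very first round, regardless of
#    how deep the data disk's backlog is
```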

Continue reading

Using NexentaConnect for file shares on VSAN

I already wrote an article on the NexentaConnect for VSAN product after seeing it in action at VMworld last year. More recently, I had the opportunity to play with it in earnest. Rather than giving you the whole low-down on NexentaConnect, I will instead use this post to show the steps involved in presenting a file share built by NexentaConnect to a VM. In this case, the VM and the file share both reside on Virtual SAN. I will also show you how to simply revert to a point-in-time snapshot of the file share using NexentaConnect. As for the common question, "can VSAN do file shares as well as storing virtual machines?", the answer is yes. This post will show you how.

Continue reading

Using HyTrust to encrypt VMDKs on VSAN

I’ve had an opportunity recently to get some hands-on time with HyTrust’s Data Control product, doing some data encryption of virtual machine disks in my Virtual SAN 6.0 environment. I won’t deep dive into all of the “bells and whistles” of HyTrust – my good buddy Rawlinson has already done a tremendous job detailing them in this blog post. Instead I am going to go through a step-by-step example of how to use HyTrust and show how it prevents your virtual machine disk from being snooped. In my case, I am encrypting virtual machine disks from VMs that are deployed on VSAN, as I have been asked this question in the past: can VMDKs on VSAN be encrypted? The answer is yes. This post will show you how.

Continue reading

Heads Up! ATS Miscompare detected between test and set HB images

I’ve been hit up this week by a number of folks asking about “ATS Miscompare detected between test and set HB images” messages after upgrading to vSphere 5.5U2 and 6.0. The purpose of this post is to give you some background on why this might have started to happen.

First off, ATS is the Atomic Test and Set primitive, one of the VAAI primitives. You can read all about VAAI primitives in the white paper. HB is short for heartbeat, which is how ownership of a file (e.g. a VMDK) is maintained on VMFS, i.e. locking. You can read more about heartbeats and locking in this blog post of mine from a few years back. In a nutshell, the heartbeat region of VMFS is used for on-disk locking, and every host that uses the VMFS volume has its own heartbeat region. This region is updated by the host on every heartbeat. The field that is updated is the timestamp, which tells other hosts that this host is alive. When the host is down, this region is used to communicate lock state to other hosts.

In vSphere 5.5U2, we started using ATS for maintaining the heartbeat. Prior to this release, we only used ATS when the heartbeat state changed. For example, referring to that older blog post, we would use ATS in the following cases:

  • Acquire a heartbeat
  • Clear a heartbeat
  • Replay a heartbeat
  • Reclaim a heartbeat

We did not use ATS for maintaining the ‘liveness’ of a heartbeat. This is the change that was introduced in 5.5U2, and it appears to have led to issues for certain storage arrays.
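For illustration, here is a minimal Python sketch (my own simplified model; not VMkernel or array code) of what an ATS-based heartbeat update looks like, and where a miscompare gets reported:

```python
# Minimal model of ATS compare-and-swap on a VMFS heartbeat slot. The
# host sends the block it expects to find on disk (the "test" image)
# plus the new block (the "set" image); the array commits the new block
# only if the on-disk block still matches the test image.
import time

disk = {}  # models heartbeat slots on the VMFS volume

def ats(slot, test_image, set_image):
    """Atomic test-and-set: succeed only if the on-disk image matches."""
    if disk.get(slot) != test_image:
        raise RuntimeError("ATS miscompare detected between test and set HB images")
    disk[slot] = set_image

# Acquire a heartbeat: stamp our host ID into a free slot.
disk["hb-slot-7"] = None
ats("hb-slot-7", None, ("host-01", time.time()))

# Post-5.5U2 'liveness' update: every heartbeat refresh is itself an ATS,
# testing against the image we last wrote. If anything changes the block
# underneath us (or the array briefly serves a stale/inconsistent read),
# the compare fails and the miscompare message is logged.
expected = disk["hb-slot-7"]
ats("hb-slot-7", expected, ("host-01", time.time()))
```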

Continue reading

VSAN 6.0 Part 5 – new vsanSparse snapshots

There is a new snapshot format introduced in VSAN 6.0 called vsanSparse. It replaces the traditional vmfsSparse format (redo logs). The vmfsSparse format was used when snapshots of VMs were taken in VSAN 5.5, and is also the format used when a snapshot is taken of a VM residing on traditional VMFS or NFS. The older vmfsSparse format left a lot to be desired when it came to performance and scalability. This KB article from our support team, which recommends that no single snapshot be used for more than 72 hours and that snapshot chains contain no more than 2-3 snapshots, speaks for itself.
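To see why long chains hurt, here is a conceptual Python sketch (assumed structures, not the actual on-disk format) of how a read traverses a redo-log chain, probing each delta in turn before falling through to the base disk:

```python
# Each snapshot delta only holds blocks written since it was taken, so a
# read must search from the running point down to the base disk. Every
# extra snapshot in the chain adds another lookup to the worst case.
base = {0: "base-A", 1: "base-B", 2: "base-C"}
snap1 = {1: "snap1-B'"}        # blocks written after snapshot 1 was taken
snap2 = {2: "snap2-C'"}        # blocks written after snapshot 2 was taken
chain = [snap2, snap1, base]   # running point first, base disk last

def read_block(lba):
    for level, delta in enumerate(chain):
        if lba in delta:
            return delta[lba], level + 1  # data plus levels probed
    raise IOError("unallocated block")

for lba in range(3):
    data, probes = read_block(lba)
    print(f"LBA {lba}: {data} (searched {probes} level(s))")
# LBA 0 falls all the way through to the base disk: three lookups for
# a single read, and the cost keeps growing as the chain gets longer.
```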

This new vsanSparse snapshot format leverages features of the new (v2) on-disk format in VSAN 6.0, VirstoFS. VirstoFS is the first implementation of the technology VMware acquired when it bought a company called Virsto a number of years ago. You can get an overview of the company from this blog post I did prior to the acquisition.

Continue reading

VSAN 6.0 Part 3 – New Default Datastore Policy

One of the most common questions I got about VSAN 5.5 was “why is VSAN deploying thick disks, when all of the documentation states that VSAN deploys thin disks?”

The answer was quite straightforward: the VMs were being deployed without a VM Storage Policy. This meant that deployment went through the standard VM deployment wizard, which offers administrators the choice of thin, lazy-zeroed thick (LZT) and eager-zeroed thick (EZT). The default option is LZT, so if you just do click-click-click (just like I do) when deploying a VM, you end up with an LZT-format VM, even on the VSAN datastore. I described this issue in this older blog post. It’s only when you select an actual VM Storage Policy when deploying a VM that VSAN uses the Object Space Reservation capability, which by default is 0%, meaning that the VM is effectively thinly provisioned. We realized that this was causing some issues for customers, so we improved this whole deployment mechanism in 6.0 with the introduction of default datastore policies.
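As a quick worked example (the numbers are illustrative, and mirrors=2 simply models the default FTT=1 policy), this is roughly how Object Space Reservation translates into reserved capacity on the VSAN datastore:

```python
# Illustrative sketch only: reserved capacity as a function of the
# Object Space Reservation (OSR) capability in the storage policy.
def reserved_gb(vmdk_size_gb, osr_percent, mirrors=2):
    """Capacity reserved up front; mirrors=2 models the default FTT=1."""
    return vmdk_size_gb * (osr_percent / 100.0) * mirrors

print(reserved_gb(100, 0))    # 0.0   -> default OSR of 0%: effectively thin
print(reserved_gb(100, 100))  # 200.0 -> OSR of 100%: both replicas reserved
```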

Continue reading

Virtual Volumes – A new way of doing snapshots

I learnt something interesting about Virtual Volumes (VVols) last week. It relates to the way in which snapshots have been implemented in VVols. Historically, VM snapshots have left a lot to be desired. So much so, that the GSS best practices for VM snapshots, as per KB article 1025279, recommend having only 2-3 snapshots in a chain (even though the maximum is 32) and using no single snapshot for more than 24-72 hours. VVols mitigate these restrictions significantly, not just because snapshots can be offloaded to the array, but also in the way consolidate and revert operations are implemented.
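As a rough mental model (my own simplified Python sketch, assuming the commonly described VVol design in which the base VVol remains the running point; the full details are in the post), compare what deleting all snapshots costs in the two schemes:

```python
# Redo-log model: the running point is the topmost delta, so deleting
# snapshots means consolidating -- copying written grains back into the
# base. The cost grows with the data written while snapshots existed.
base, deltas = {0: "A"}, [{0: "A1"}, {0: "A2"}]
for delta in deltas:
    base.update(delta)        # data movement for every changed grain
deltas.clear()
print("redo-log consolidate:", base)   # {0: 'A2'} after copying

# VVol-style model (assumed, simplified): the base object is always the
# running point and each snapshot is its own point-in-time object, so
# deleting snapshots is a metadata operation with no copy in the I/O path.
current = {0: "A2"}
snapshots = [{0: "A"}, {0: "A1"}]      # point-in-time copies on the array
snapshots.clear()
print("vvol consolidate:", current)    # running point untouched
```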

Continue reading