One of the really nice new features of VSAN 6.0 is fault domains. Previously, there was very little control over where VSAN placed virtual machine components. In order to protect against something like a rack failure, you may have had to use a very high NumberOfFailuresToTolerate value, resulting in multiple copies of the VM data dispersed around the cluster. With VSAN 6.0, this is no longer a concern as hosts participating in the VSAN Cluster can be placed in different failure domains. This means that component placement will take place across failure domains and not just across hosts. Let’s look at this in action.
I’ve been hit up this week by a number of folks asking about “ATS Miscompare detected between test and set HB images” messages after upgrading to vSphere 5.5U2 and 6.0. The purpose of this post is to give you some background on why this might have started to happen.
First off, ATS is the Atomic Test and Set primitive which is one of the VAAI primitives. You can read all about VAAI primitives in the white paper. HB is short for heartbeat. This is how ownership of a file (e.g VMDK) is maintained on VMFS, i.e. lock. You can read more about heartbeats and locking in this blog post of mine from a few years back. In a nutshell, the heartbeat region of VMFS is used for on-disk locking, and every host that uses the VMFS volume has its own heartbeat region. This region is updated by the host on every heartbeat. The region that is updated is the time stamp, which tells others that this host is alive. When the host is down, this region is used to communicate lock state to other hosts.
In vSphere 5.5U2, we started using ATS for maintaining the heartbeat. Prior to this release, we only used ATS when the heartbeat state changed. For example, referring to the older blog, we would use ATS in the following cases:
- Acquire a heartbeat
- Clear a heartbeat
- Replay a heartbeat
- Reclaim a heartbeat
We did not use ATS for maintaining the ‘liveness’ of a heartbeat. This is the change that was introduced in 5.5U2 and which appears to have led to issues for certain storage arrays.
There have been a number of queries around Virtual Volumes (VVols) and replication, especially since the release of KB article 2112039 which details all the interoperability aspects of VVols.
In Q1 of the KB, the question is asked “Which VMware Products are interoperable with Virtual Volumes (VVols)?” The response includes “VMware vSphere Replication 6.0.x”.
In Q2 of the KB, the question is asked “Which VMware Products are currently NOT interoperable with Virtual Volumes (VVols)?” The response includes “VMware Site Recovery Manager (SRM) 5.x to 6.0.x”
In Q4 of the KB, the question is asked “Which VMware vSphere 6.0.x features are currently NOT interoperable with Virtual Volumes (VVols)?” The response includes “Array-based replication”.
So where does that leave us from a replication/DR standpoint with VVols?
Before I begin, this isn’t really a feature of VSAN so to speak. In vSphere 6.0, you can also blink LEDs on disk drives without VSAN deployed. However, because of the scale up and scale out features in VSAN 6.0, where you can have very many disk drives and very many ESXi hosts, being able to identify a drive for replacement becomes very important.
So this is obviously a useful feature. And of course I wanted to test it out, see how it works, etc. In my 4 node cluster, I started to test this feature on some disks in each of the hosts. On 2 out of the 4 hosts, this worked fine. On the other 2, it did not. Eh? These were all identically configured hosts, all running 6.0 GA with the same controller and identical disks. In the ensuing investigation, this is what was found.
In some previous posts, I highlighted how VVols introduces the concept of “undo” format snapshots where the VM is always running on the base disk. I also mentioned that this has a direct impact on the way that we do snapshots on VMs that support VSS, the Microsoft Volume Shadow Copy Service. But before getting into the detail regarding how VVols is different, it’s worth spending some time understanding whats going on when VSS is called to quiesce applications when a traditional snapshot is taken. If you try to research this yourself, you’ll find that there is very little information describing what is going on. The best place I found this behaviour described is in the Designing Backup Solutions for VMware vSphere vStorage APIs for Data Protection 1.2:
Windows 2008 application level quiescing is performed using a hardware snapshot provider. After quiescing the virtual machine, the hardware snapshot provider creates two redo logs per disk: one for the live virtual machine writes and another for the VSS and writers in the guest to modify the disks after the snapshot operation as part of the quiescing operations. The snapshot configuration information reports this second redo log as part of the snapshot. This redo log represents the quiesced state of all the applications in the guest. This redo log must be opened for backup with VDDK 1.2.
There is a subtle difference in maintenance mode behaviours between VSAN version 5.5 and VSAN version 6.0. In Virtual SAN version 5.5, when a host is placed into maintenance mode with the “Ensure Accessibility” option, the host is maintenance mode continues to contribute its storage towards the VSAN datastore. In other words, any VMs that had components stored on this host still remained fully compliance with all of the components available. In VSAN 6.0, this behaviour changed. Now, when a host is placed into maintenance mode, it no longer contributes storage to the VSAN datastore, and any components that reside on the physical storage of the host that is placed into maintenance mode is marked as absent. The following screen shots show the behaviour.
There is a new snapshot format introduced in VSAN 6.0 called vsanSparse. These replace the traditional vmfsSparse format (redo logs). The vmfsSparse format was used when snapshots of VMs were taken in VSAN 5.5, and are also the format used when a snapshot is taken of a VM residing on traditional VMFS and NFS. The older vmfsSparse format left a lot to be desired when it came to performance and scalability. This KB article from our support team, indicating that no snapshot should be used for more than 72 hours, and snapshot chains should contain no more than 2-3 snapshots, speaks for itself.
This new vsanSparse snapshot format leverages features of the new (v2) on-disk format in VSAN 6.0, VirstoFS. VirstoFS is the first implementation of technology that was acquired when VMware bought a company called Virsto a number of years ago. You can get an overview of this company from this blog post I did prior to the acquisition.