Host Monitoring
vSphere HA should be turned on. This makes your virtual machines highly available in the VSAN stretched cluster. Host monitoring should also be enabled. This allows hosts in the vSphere HA cluster to exchange heartbeats over the network, which confirms that all nodes continue to participate in the cluster and are healthy.
Host Hardware Monitoring – VM Component Protection
VSAN does not support VM Component Protection (VMCP) at this time, so this option should be left unchecked.
Virtual Machine Monitoring
This monitors the heartbeats of the virtual machines, and restarts a virtual machine if its heartbeats are not received over a period of time. This setting is optional, and is left to the customer's discretion. VMware supports having this feature either enabled or disabled.
Failure conditions and VM response
This is where the host isolation response is placed. Consider a situation where a network failure results in a host being isolated from the rest of the cluster. What do you wish to happen to those virtual machines that are on the isolated host? The VMware recommendation, when using vSphere HA in a VSAN stretched cluster, is to have the VMs powered off and restarted.
Admission Control
VMware supports an active/active VSAN stretched cluster configuration, in other words, running virtual machines at both data sites. Given this, we feel that admission control should be configured in such a way that will allow the complete workload to run on one remaining site if there is a complete site failure. With that in mind, the recommendation is to set admission control to a percentage value of 50%. This will leave 50% of the cluster’s CPU and Memory resources free, and should ensure that one data site can run all the virtual machines in the event of a complete failure of the other site.
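The arithmetic behind the 50% figure can be sketched as follows. This is a simplified illustration rather than how vSphere HA itself calculates admission control: the host counts and per-host sizes are hypothetical, the two sites are assumed to be identical, and `usable_capacity` is a helper name invented for this example.

```python
# Hypothetical inventory: two data sites with identical host counts and sizes.
HOSTS_PER_SITE = 4
CPU_MHZ_PER_HOST = 20_000
MEM_MB_PER_HOST = 262_144
RESERVED_PCT = 50  # admission control percentage from the recommendation


def usable_capacity(total, reserved_pct):
    """Capacity left for running VMs after the failover reservation."""
    return total * (100 - reserved_pct) // 100


# Cluster-wide totals span both data sites.
cluster_cpu = 2 * HOSTS_PER_SITE * CPU_MHZ_PER_HOST
cluster_mem = 2 * HOSTS_PER_SITE * MEM_MB_PER_HOST

# Capacity of a single surviving site.
site_cpu = HOSTS_PER_SITE * CPU_MHZ_PER_HOST
site_mem = HOSTS_PER_SITE * MEM_MB_PER_HOST

usable_cpu = usable_capacity(cluster_cpu, RESERVED_PCT)
usable_mem = usable_capacity(cluster_mem, RESERVED_PCT)

# With symmetric sites, the usable share equals one site's capacity,
# so the full permitted workload fits on the surviving site.
fits_on_one_site = usable_cpu <= site_cpu and usable_mem <= site_mem
print(usable_cpu, usable_mem, fits_on_one_site)
```

Note that the result depends on the symmetry assumption: if one site has fewer or smaller hosts than the other, 50% of the cluster total can exceed what the smaller site is able to run on its own.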
Datastore for heartbeating
VSAN does not support heartbeat datastore functionality, so this needs to be disabled. There is no disable button for heartbeat datastores, so if there are VMFS volumes or NFS volumes presented to the hosts in the cluster, these datastores may be automatically chosen as heartbeat datastores. This could result in unpredictable behaviour in the VSAN stretched cluster, especially when it comes to failover events. Therefore customers need to ensure that heartbeat datastores are not in use. If you have datastores presented to the hosts other than the VSAN datastore, you should select the “Use datastores only from the specified list” option, and then not select any datastores from the list.
If there are no other datastores, and only the VSAN datastore, then this is not a concern and can be left at the default.
Advanced Options
A number of advanced options need to be added to ensure that host isolation works correctly when vSphere HA is configured on a VSAN stretched cluster. In a VSAN stretched cluster, one of the isolation addresses should reside in the site 1 data center and the other should reside in the site 2 data center. This enables vSphere HA to validate complete network isolation in the case of a connection failure between sites. VMware recommends enabling host isolation response and specifying isolation response addresses that are on the VSAN network rather than the default gateway on the management network. Therefore the vSphere HA advanced setting das.usedefaultisolationaddress should be set to false.
As stated, VMware recommends specifying two isolation response addresses, and each of these addresses should be site specific. In other words, select one isolation response IP address from the preferred VSAN stretched cluster site and another from the secondary VSAN stretched cluster site. The vSphere HA advanced setting used for the first isolation response IP address is das.isolationaddress0, and it should be set to an IP address on the VSAN network that resides on the first site. The vSphere HA advanced setting used for the second isolation response IP address is das.isolationaddress1, and it should be set to an IP address on the VSAN network that resides on the second site.
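As an illustration, the three advanced options could be assembled and sanity-checked before being entered in the vSphere HA advanced options dialog. This is only a sketch: `build_isolation_options` is a hypothetical helper, the IP addresses are placeholders, and the code does not talk to vCenter; it only prepares the key/value pairs named above.

```python
import ipaddress


def build_isolation_options(site1_addr, site2_addr):
    """Return the HA advanced options for two site-specific isolation addresses.

    Raises ValueError if either address is malformed or if both are the same.
    """
    addr0 = ipaddress.ip_address(site1_addr)  # validates the address syntax
    addr1 = ipaddress.ip_address(site2_addr)
    if addr0 == addr1:
        raise ValueError("use a different isolation address for each site")
    return {
        "das.usedefaultisolationaddress": "false",
        "das.isolationaddress0": str(addr0),  # VSAN network address at site 1
        "das.isolationaddress1": str(addr1),  # VSAN network address at site 2
    }


# Placeholder addresses; substitute addresses on your own VSAN network.
options = build_isolation_options("172.16.10.1", "172.16.20.1")
for key, value in options.items():
    print(f"{key} = {value}")
```

The check can only confirm that the two addresses are well formed and distinct. Whether each one actually resides at the intended site is something only the administrator can verify.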
Summary
Here is a summary of all the settings needed when enabling vSphere HA on top of a VSAN stretched cluster.
vSphere HA | Turn on |
Host Monitoring | Enabled |
Host Hardware Monitoring – VM Component Protection: “Protect against Storage Connectivity Loss” | Disabled (default) |
Virtual Machine Monitoring | Customer Preference – Disabled by default |
Admission Control | Define failover capacity by reserving a percentage of cluster resources. Set to 50% for both CPU & Memory. |
Host Isolation Response | Power off and restart VMs |
Datastore Heartbeats | “Use datastores only from the specified list”, but do not select any datastores from the list. This disables Datastore Heartbeats |
Advanced Settings:
das.usedefaultisolationaddress | False |
das.isolationaddress0 | IP address on VSAN network on site 1 |
das.isolationaddress1 | IP address on VSAN network on site 2 |
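The summary can also be expressed as a simple checklist in code. The sketch below compares a dictionary of settings against the recommendations in the table. The key names other than the three das.* options are invented for this example and do not correspond to any vSphere API; Virtual Machine Monitoring is omitted because it is left to customer preference.

```python
# Recommended values from the summary table (key names are hypothetical).
RECOMMENDED = {
    "vsphere_ha": True,
    "host_monitoring": True,
    "vmcp_storage_protection": False,
    "admission_control_cpu_pct": 50,
    "admission_control_mem_pct": 50,
    "host_isolation_response": "power off and restart VMs",
    "heartbeat_datastores": [],  # none selected from the specified list
    "das.usedefaultisolationaddress": "false",
}

# These must be set, but their values are specific to each environment.
REQUIRED_KEYS = ("das.isolationaddress0", "das.isolationaddress1")


def find_deviations(actual):
    """List every setting that differs from the recommendation or is missing."""
    problems = []
    for key, want in RECOMMENDED.items():
        got = actual.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, found {got!r}")
    for key in REQUIRED_KEYS:
        if not actual.get(key):
            problems.append(f"{key}: not set")
    return problems
```

An empty list from `find_deviations` means the supplied settings match the table. The settings dictionary itself would have to be filled in by hand or by whatever inventory tooling is already in use.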
The VSAN 6.1 Stretched Cluster Guide, which is now available, will cover these settings in more detail. Please refer to this guide if planning to implement a VSAN stretched cluster.