As part of the Virtual SAN 6.1 announcements at VMworld 2015, VMware announced two new, eagerly anticipated features. The first of these is VSAN stretched cluster, allowing you to protect your virtual machines across data centers, not just racks. And the second is 2-node VSAN, which will be an excellent solution for remote office/branch office (ROBO) configurations. To allow these configuration to work, a dedicated witness host is required. For those of you already familiar with VSAN, a witness component is used in the event of a split brain to figure out if the virtual machine objects have a quorum. If there is more than 50% of the VSAN’s object is still available, and there is a fully copy of the data object, then the VM’s object remains available. Witnesses contribute towards this “greater than” 50% figure. You can read more about the witness here.
Physical or Virtual Witness
Now, rather than have customers dedicate a physical ESXi host to be a witness host, as well as consume a license for the witness host, VMware has developed the VSAN witness appliance (which is essentially an ESXi running in a VM) to take care of the witness requirements. This is achievable because a witness host requires a lot less capacity, bandwidth and performance when compared to hosts in regular VSAN clusters, or hosts on data sites in a VSAN stretched cluster. The purpose of the witness appliance is to store the virtual machine witness components, and when there is a failure or a split brain situation, it contributes towards the object quorum so that the virtual machine can remain available. This is a significant cost saving for customers who wish to implement a VSAN stretched cluster, or indeed a two node VSAN ROBO deployment. Even better, there are no licenses needed with the witness appliance, as it comes preconfigured with a license. Of course, if you wish to use a physical host for the witness, you can do that too.
You should note however that there are some rules governing the witness appliance. A witness appliance cannot be shared between configurations; it is a 1:1 relationship with a stretched cluster or with a ROBO configuration. The bandwidth and latency requirements between the witness site and the data site must also be met.
- For stretched cluster, there is a 5ms RTT between data sites and a 100ms – 200ms RTT between the data sites and the witness, depending on the size of the configuration.
- For two node ROBO configurations, there is a 5ms RTT between data sites and a 500ms RTT between the data sites and the witness.
Witness Appliance Configurations
When deploying the witness appliance, there are a number of different configurations that can be chosen depending on the size of the environment. This all essentially boils down to the number of VMs, and thus the number of witness components that you expect to reside on the witness appliance. These are the different configurations, and they are chosen during appliance deployment. Note that the SSD is a VMDK tagged as an SSD. There is no requirement to use a flash device in the appliance.
- Tiny (10 VMs or fewer)
- 2 vCPUs, 8GB vRAM
- 8GB ESXi Boot Disk, one 10GB SSD, one 15GB HDD
- Supports a maximum of 750 witness components
- Medium (up to 500 VMs)
- 2 vCPUs, 16GB vRAM
- 8GB ESXi Boot Disk, one 10GB SSD, one 350GB HDD
- Supports a maximum of 21,000 witness components
- Large (more than 500 VMs)
- 2 vCPUs, 32GB vRAM
- 8GB ESXi Boot Disk, one 10GB SSD, three 350GB HDDs
- Supports a maximum of 45,000 witness components
Witness Appliance Networking
It is also important that you populate the correct and complete network settings of the witness appliance. Obviously this appliance will most likely sit on a physical ESXi host, so you will need to ensure that the networking on this host can communicate back to the data sites’ ESXi hosts. Once the witness appliance is deployed, customers should launch a console to the nested ESXi host through the DCUI, and populate the network details accordingly. The appliance has been shipped with some arbitrary hostname and DNS settings. These must be configured correctly with your environment, or it may cause some odd configuration issues later on. In a future version of the appliance, these will be cleaned up. I will also do a step-by-step witness deployment in a future post.
The witness appliance should now be added to vCenter as an ESXi host. Upon closer examination, you will see that it has its own virtual standard switches (called vSwitch0 and witnessSwitch respectively) and more importantly, witnessSwitch has a predefined portgroup called witness pg.
Each vSwitch has an uplink. The VMware guidance would be to use vSwitch0 for the management network and witnessSwitch for the VSAN network. The purpose of preconfiguring the network with a port group is so that the virtual machine network adapters MAC addresses matches the vmnic MAC address of the nested ESXi host. When these two MAC addresses match (on the inside and the outside so to speak), the vSwitch will pass the network traffic to the nested ESXi. When they do not match, this traffic will be dropped as the vSwitch does not know who the packets are intended for. Another way of resolving this is to use promiscuous mode, but when the inside and outside MAC addresses match, there is no need for a promiscuous mode setting on the virtual switch. This is not a concern for the first adapter on the appliance (the MACs always match), but it is necessary for all subsequent adapters. Do not delete this preconfigured witness portgroup, or the MAC addresses may not match when a new portgroup is created. If you delete it, you will need to enable promiscuous mode to allow communication. The recommendation would be to redeploy the appliance to avoid this.
One final note on networking; whilst VSAN traffic between the nodes in the data sites in this version of VSAN continues to have a multicast requirement, the witness traffic between the data nodes and the witness appliance is in fact unicast.
Identifying a Witness Appliance
One of the nice features of the witness appliance is that it is very easy to identify in the vCenter inventory. Witness appliance are shared in blue, to distinguish them from other ESXi hosts in the cluster.
No datastores have been configured warning
In the above screenshot, the witness does not contain any warnings. However out of the box, the witness will have a warning that states “No datastores have been configured”. Unfortunately, there is no supported way of disabling this warning at the present time. Again, this is something we will resolve going forward.
Replacing the witness host
If there is an issue with the witness host, then the witness components will go in an “absent” state. This will not impact the virtual machine as they continue to have a full copy of the data available from the data sites as well as greater than 50% of the components available. This means the virtual machine will stay accessible. If the witness host is unrecoverable, and a new witness host is deployed, the stretched cluster configuration can be recreated. Note however that the clomd timeout value still holds in this situation, so the components will be left absent for 60 minuted by default. After that timer has expired, the witness components will be rebuilt on the witness host and the “absent” witness components will return to an active state.
At GA, we will be releasing a VSAN Stretched Cluster Guide, which will describe the network topologies and configuration in more detail. It contains deployment instructions for the VSAN witness appliance, including all the configuration steps. If considering a VSAN stretched cluster deployment (or indeed a ROBO deployment), you should most definitely review this document before starting the deployment.