Physical or Virtual Witness
Now, rather than have customers dedicate a physical ESXi host to be a witness host, as well as consume a license for the witness host, VMware has developed the VSAN witness appliance (which is essentially an ESXi running in a VM) to take care of the witness requirements. This is achievable because a witness host requires a lot less capacity, bandwidth and performance when compared to hosts in regular VSAN clusters, or hosts on data sites in a VSAN stretched cluster. The purpose of the witness appliance is to store the virtual machine witness components, and when there is a failure or a split brain situation, it contributes towards the object quorum so that the virtual machine can remain available. This is a significant cost saving for customers who wish to implement a VSAN stretched cluster, or indeed a two node VSAN ROBO deployment. Even better, there are no licenses needed with the witness appliance, as it comes preconfigured with a license. Of course, if you wish to use a physical host for the witness, you can do that too.
- For stretched cluster, there is a 5ms RTT between data sites and a 100ms – 200ms RTT between the data sites and the witness, depending on the size of the configuration.
- For two node ROBO configurations, there is a 5ms RTT between data sites and a 500ms RTT between the data sites and the witness.
Witness Appliance Configurations
When deploying the witness appliance, there are a number of different configurations that can be chosen depending on the size of the environment. This all essentially boils down to the number of VMs, and thus the number of witness components that you expect to reside on the witness appliance. These are the different configurations, and they are chosen during appliance deployment. Note that the SSD is a VMDK tagged as an SSD. There is no requirement to use a flash device in the appliance.
- Tiny (10 VMs or fewer)
- 2 vCPUs, 8GB vRAM
- 8GB ESXi Boot Disk, one 10GB SSD, one 15GB HDD
- Supports a maximum of 750 witness components
- Medium (up to 500 VMs)
- 2 vCPUs, 16GB vRAM
- 8GB ESXi Boot Disk, one 10GB SSD, one 350GB HDD
- Supports a maximum of 21,000 witness components
- Large (more than 500 VMs)
- 2 vCPUs, 32GB vRAM
- 8GB ESXi Boot Disk, one 10GB SSD, three 350GB HDDs
- Supports a maximum of 45,000 witness components
Witness Appliance Networking
It is also important that you populate the correct and complete network settings of the witness appliance. Obviously this appliance will most likely sit on a physical ESXi host, so you will need to ensure that the networking on this host can communicate back to the data sites’ ESXi hosts. Once the witness appliance is deployed, customers should launch a console to the nested ESXi host through the DCUI, and populate the network details accordingly. The appliance has been shipped with some arbitrary hostname and DNS settings. These must be configured correctly with your environment, or it may cause some odd configuration issues later on. In a future version of the appliance, these will be cleaned up. I will also do a step-by-step witness deployment in a future post.
The witness appliance should now be added to vCenter as an ESXi host. Upon closer examination, you will see that it has its own virtual standard switches (called vSwitch0 and witnessSwitch respectively) and more importantly, witnessSwitch has a predefined portgroup called witness pg.
Identifying a Witness Appliance
One of the nice features of the witness appliance is that it is very easy to identify in the vCenter inventory. Witness appliance are shared in blue, to distinguish them from other ESXi hosts in the cluster.
No datastores have been configured warning
In the above screenshot, the witness does not contain any warnings. However out of the box, the witness will have a warning that states “No datastores have been configured”. Unfortunately, there is no supported way of disabling this warning at the present time. Again, this is something we will resolve going forward.
Replacing the witness host
If there is an issue with the witness host, then the witness components will go in an “absent” state. This will not impact the virtual machine as they continue to have a full copy of the data available from the data sites as well as greater than 50% of the components available. This means the virtual machine will stay accessible. If the witness host is unrecoverable, and a new witness host is deployed, the stretched cluster configuration can be recreated. Note however that the clomd timeout value still holds in this situation, so the components will be left absent for 60 minuted by default. After that timer has expired, the witness components will be rebuilt on the witness host and the “absent” witness components will return to an active state.
More information
At GA, we will be releasing a VSAN Stretched Cluster Guide, which will describe the network topologies and configuration in more detail. It contains deployment instructions for the VSAN witness appliance, including all the configuration steps. If considering a VSAN stretched cluster deployment (or indeed a ROBO deployment), you should most definitely review this document before starting the deployment.