A closer look at the VSAN witness appliance

As part of the Virtual SAN 6.1 announcements at VMworld 2015, VMware announced two new, eagerly anticipated features. The first of these is VSAN stretched cluster, allowing you to protect your virtual machines across data centers, not just racks. The second is 2-node VSAN, which will be an excellent solution for remote office/branch office (ROBO) configurations. To allow these configurations to work, a dedicated witness host is required. For those of you already familiar with VSAN, a witness component is used in the event of a split brain to figure out if the virtual machine objects have a quorum. If more than 50% of the components that make up a VSAN object are still available, and there is a full copy of the data object, then the VM’s object remains available. Witnesses contribute towards this “greater than 50%” figure. You can read more about the witness here.
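
To make the quorum rule concrete, here is a minimal Python sketch of the availability check described above. It is only a model for illustration, not how VSAN is implemented: the `Component` class, the one-vote-per-component default and the three-component layout (two data replicas plus one witness) are my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    is_data_replica: bool  # True for a full data copy, False for a witness
    available: bool
    votes: int = 1  # assumption: every component carries a single vote


def object_available(components):
    """Return True if the object keeps quorum AND a full copy of the data."""
    total_votes = sum(c.votes for c in components)
    live_votes = sum(c.votes for c in components if c.available)
    has_quorum = live_votes * 2 > total_votes  # strictly more than 50%
    has_full_copy = any(c.is_data_replica and c.available for c in components)
    return has_quorum and has_full_copy


def mirrored_object(site_a_up, site_b_up, witness_up):
    """Hypothetical layout: one replica per data site plus a witness."""
    return [
        Component("replica-site-a", True, site_a_up),
        Component("replica-site-b", True, site_b_up),
        Component("witness", False, witness_up),
    ]


# One data site lost, witness still reachable: the object stays available.
print(object_available(mirrored_object(True, False, True)))
# One data site AND the witness lost: no quorum, so the object is unavailable.
print(object_available(mirrored_object(True, False, False)))
```

Note that the witness alone can never keep an object available, since it holds no data. It only tips the vote in favour of whichever side still has a full copy.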

Physical or Virtual Witness

Now, rather than have customers dedicate a physical ESXi host to be a witness host, as well as consume a license for the witness host, VMware has developed the VSAN witness appliance (which is essentially an ESXi running in a VM) to take care of the witness requirements. This is achievable because a witness host requires a lot less capacity, bandwidth and performance when compared to hosts in regular VSAN clusters, or hosts on data sites in a VSAN stretched cluster. The purpose of the witness appliance is to store the virtual machine witness components, and when there is a failure or a split brain situation, it contributes towards the object quorum so that the virtual machine can remain available. This is a significant cost saving for customers who wish to implement a VSAN stretched cluster, or indeed a two node VSAN ROBO deployment. Even better, there are no licenses needed with the witness appliance, as it comes preconfigured with a license. Of course, if you wish to use a physical host for the witness, you can do that too.

You should note, however, that there are some rules governing the witness appliance. A witness appliance cannot be shared between configurations; it is a 1:1 relationship with a stretched cluster or with a ROBO configuration. The bandwidth and latency requirements between the witness site and the data site must also be met.

  • For stretched cluster, the requirement is a maximum RTT of 5ms between data sites and a maximum RTT of 100ms – 200ms between the data sites and the witness, depending on the size of the configuration.
  • For two node ROBO configurations, the requirement is a maximum RTT of 5ms between data sites and a maximum RTT of 500ms between the data sites and the witness.
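
If you want to sanity-check measured round-trip times against these figures, something along the following lines would do. This is a rough sketch: the function name and the three result strings are made up for illustration, and because the list above gives a 100ms – 200ms range for stretched clusters without stating where the cut-over lies, anything inside that range is reported as needing a sizing check rather than a pass or fail.

```python
# Maximum round-trip times in milliseconds, taken from the list above.
DATA_SITE_RTT_MAX_MS = 5
WITNESS_RTT_MAX_MS = {
    # (strict limit, relaxed limit). The post gives a range for stretched
    # clusters that depends on configuration size, so both bounds are kept.
    "stretched": (100, 200),
    "robo": (500, 500),
}


def check_latency(config, data_rtt_ms, witness_rtt_ms):
    """Return "ok", "check sizing" or "too slow" for the measured RTTs."""
    strict_max, relaxed_max = WITNESS_RTT_MAX_MS[config]
    if data_rtt_ms > DATA_SITE_RTT_MAX_MS or witness_rtt_ms > relaxed_max:
        return "too slow"
    if witness_rtt_ms > strict_max:
        # Only acceptable for some configuration sizes; consult the guide.
        return "check sizing"
    return "ok"


print(check_latency("stretched", data_rtt_ms=3, witness_rtt_ms=150))
print(check_latency("robo", data_rtt_ms=1, witness_rtt_ms=450))
```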

Witness Appliance Configurations

When deploying the witness appliance, there are a number of different configurations that can be chosen depending on the size of the environment. This all essentially boils down to the number of VMs, and thus the number of witness components that you expect to reside on the witness appliance. These are the different configurations, and they are chosen during appliance deployment. Note that the SSD is a VMDK tagged as an SSD. There is no requirement to use a flash device in the appliance.

  • Tiny (10 VMs or fewer)
    • 2 vCPUs, 8GB vRAM
    • 8GB ESXi Boot Disk, one 10GB SSD, one 15GB HDD
    • Supports a maximum of 750 witness components
  • Medium (up to 500 VMs)
    • 2 vCPUs, 16GB vRAM
    • 8GB ESXi Boot Disk, one 10GB SSD, one 350GB HDD
    • Supports a maximum of 21,000 witness components
  • Large (more than 500 VMs)
    • 2 vCPUs, 32GB vRAM
    • 8GB ESXi Boot Disk, one 10GB SSD, three 350GB HDDs
    • Supports a maximum of 45,000 witness components
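
The sizing choice can be expressed as a small lookup. The sketch below simply picks the smallest configuration from the list above that covers both the VM count and the expected number of witness components. The function name is my own, and treating Large as having no VM upper bound is an assumption based on the “more than 500 VMs” wording.

```python
# (name, maximum VMs, maximum witness components), from the list above.
# None means no VM upper bound is stated for that configuration.
PROFILES = [
    ("Tiny", 10, 750),
    ("Medium", 500, 21_000),
    ("Large", None, 45_000),
]


def pick_profile(vm_count, witness_components):
    """Return the smallest appliance configuration that fits both numbers."""
    for name, max_vms, max_components in PROFILES:
        fits_vms = max_vms is None or vm_count <= max_vms
        if fits_vms and witness_components <= max_components:
            return name
    raise ValueError("more witness components than any configuration supports")


print(pick_profile(vm_count=8, witness_components=300))
print(pick_profile(vm_count=8, witness_components=900))
```

The second call shows why the component count matters as much as the VM count: a handful of VMs with many objects can still push you up to the next size.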

Witness Appliance Networking

It is also important that you populate the correct and complete network settings of the witness appliance. Obviously this appliance will most likely sit on a physical ESXi host, so you will need to ensure that the networking on this host can communicate back to the data sites’ ESXi hosts. Once the witness appliance is deployed, customers should launch a console to the nested ESXi host through the DCUI, and populate the network details accordingly. The appliance has been shipped with some arbitrary hostname and DNS settings. These must be configured correctly for your environment, or they may cause some odd configuration issues later on. In a future version of the appliance, these will be cleaned up. I will also do a step-by-step witness deployment in a future post.

The witness appliance should now be added to vCenter as an ESXi host. Upon closer examination, you will see that it has its own virtual standard switches (called vSwitch0 and witnessSwitch respectively) and more importantly, witnessSwitch has a predefined portgroup called witness pg.

Each vSwitch has an uplink. The VMware guidance would be to use vSwitch0 for the management network and witnessSwitch for the VSAN network. The purpose of preconfiguring the network with a port group is so that the MAC address of the virtual machine’s network adapter matches the vmnic MAC address of the nested ESXi host. When these two MAC addresses match (on the inside and the outside, so to speak), the vSwitch will pass the network traffic to the nested ESXi. When they do not match, this traffic will be dropped, as the vSwitch does not know who the packets are intended for. Another way of resolving this is to use promiscuous mode, but when the inside and outside MAC addresses match, there is no need for a promiscuous mode setting on the virtual switch. This is not a concern for the first adapter on the appliance (the MACs always match), but it is necessary for all subsequent adapters. Do not delete this preconfigured witness portgroup, or the MAC addresses may not match when a new portgroup is created. If you delete it, you will need to enable promiscuous mode to allow communication. The recommendation would be to redeploy the appliance to avoid this.
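
The MAC matching behaviour can be pictured with a toy model like the one below. It is a deliberate simplification of what the outer vSwitch does for a single port, and the function name and MAC addresses are made up for illustration.

```python
def frame_delivered(dst_mac, vnic_mac, promiscuous=False):
    """Toy model of the outer vSwitch decision for one port.

    dst_mac  -- destination MAC of the frame (the nested ESXi's interface)
    vnic_mac -- MAC of the appliance VM's virtual network adapter
    """
    if promiscuous:
        return True  # every frame is passed to the port, matching or not
    return dst_mac.lower() == vnic_mac.lower()


outside_mac = "00:50:56:aa:bb:02"  # illustrative address only

# Inside and outside MACs match: traffic reaches the nested ESXi.
print(frame_delivered("00:50:56:AA:BB:02", outside_mac))
# A recreated portgroup ends up with a different inside MAC: frames are dropped...
print(frame_delivered("00:50:56:cc:dd:03", outside_mac))
# ...unless promiscuous mode is enabled on the virtual switch.
print(frame_delivered("00:50:56:cc:dd:03", outside_mac, promiscuous=True))
```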

One final note on networking: whilst VSAN traffic between the nodes in the data sites in this version of VSAN continues to have a multicast requirement, the witness traffic between the data nodes and the witness appliance is in fact unicast.

Identifying a Witness Appliance

One of the nice features of the witness appliance is that it is very easy to identify in the vCenter inventory. Witness appliances are shaded in blue, to distinguish them from other ESXi hosts in the cluster.

Note, however, that this does not apply to physical ESXi hosts used as witnesses. This only occurs with the witness appliance.

No datastores have been configured warning

In the above screenshot, the witness does not contain any warnings. However out of the box, the witness will have a warning that states “No datastores have been configured”. Unfortunately, there is no supported way of disabling this warning at the present time. Again, this is something we will resolve going forward.

Replacing the witness host

If there is an issue with the witness host, then the witness components will go into an “absent” state. This will not impact the virtual machines, as they continue to have a full copy of the data available from the data sites as well as greater than 50% of the components available. This means the virtual machines will stay accessible. If the witness host is unrecoverable, and a new witness host is deployed, the stretched cluster configuration can be recreated. Note however that the clomd timeout value still holds in this situation, so the components will be left absent for 60 minutes by default. After that timer has expired, the witness components will be rebuilt on the witness host and the “absent” witness components will return to an active state.
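
As a quick illustration of the timer, the sketch below works out when the rebuild of the absent witness components would begin, assuming the default 60 minute delay mentioned above. The function names are my own, and it says nothing about how long the rebuild itself takes.

```python
from datetime import datetime, timedelta

# Default delay before absent components are rebuilt, per the text above.
DEFAULT_REPAIR_DELAY = timedelta(minutes=60)


def rebuild_starts_at(absent_since, delay=DEFAULT_REPAIR_DELAY):
    """Earliest time at which absent components become eligible for rebuild."""
    return absent_since + delay


def rebuild_due(absent_since, now, delay=DEFAULT_REPAIR_DELAY):
    """True once the delay has expired for components absent since that time."""
    return now >= rebuild_starts_at(absent_since, delay)


went_absent = datetime(2015, 9, 1, 10, 0)
print(rebuild_starts_at(went_absent))
print(rebuild_due(went_absent, now=datetime(2015, 9, 1, 10, 45)))
```

In other words, deploying a replacement witness 15 minutes after the failure still leaves you waiting the remaining 45 minutes before the witness components are rebuilt.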

More information

At GA, we will be releasing a VSAN Stretched Cluster Guide, which will describe the network topologies and configuration in more detail. It contains deployment instructions for the VSAN witness appliance, including all the configuration steps. If considering a VSAN stretched cluster deployment (or indeed a ROBO deployment), you should most definitely review this document before starting the deployment.

32 comments
  1. Hi Cormac

    Good article explaining the different aspects of the VSAN witness host. So given the witness is an actual ESXi host, I take it we need to patch and monitor it in the same way as every other ESXi host? I am also assuming here that the patch version of the witness needs to be the same as the other physical VSAN hosts.

    Thanks
    David

  2. Just a question: having just a 3-host VSAN 6.0 cluster and upgrading to 6.1, could I add a physical witness appliance (is there an HCL for this, or is the standard VMware HCL enough?) in order to obtain something like a 4-host cluster, so that I can put a host in maintenance mode and continue to have FTT=1, all without buying any other license?

    • Not possible I’m afraid. Stretched Cluster/ROBO uses an algorithm to always place witness components on the witness host. That functionality is not there on standard VSAN deployments.

        • So you wish to move to 3 physical nodes, rather than 2 physical nodes and a witness appliance?

          The only major difference is the use of fault domains. While I haven’t gone through the steps, I can’t see anything that would block this.

          Have you tried to do this yourself? Have you encountered any blockers? I’d be interested to know.

          • Yes let’s say i eventually do need more resources and i have to make it a 3-node setup.

            I have not tried it yet. Your note about a specific algorithm for ROBO setup regarding witness component placement made me think the ROBO setup was possibly more of a static configuration currently.

          • I think I understand what you mean now.

            The main thing to keep in mind is that when I am talking about ROBO, I am talking about 2-node VSAN deployment with a witness appliance. This configuration, which is identical to VSAN stretch clustering, can be achieved with VSAN standard license.

            Once you go above the 2-node VSAN stretched cluster deployment with a witness appliance, you would require the VSAN advanced license. The standard license will not allow you to do this configuration with more than 2 nodes.

            So really, if you want to do a 3-node ROBO deployment with a VSAN standard license, you would not be able to use the witness appliance. In this case, the witness components would be spread across all three nodes in your configuration.

            Hope that makes sense.

  3. Thanks for a very informative post. One thing I feel is missing that may not be obvious to everyone even though “FTT=1” is stated – there is no data redundancy within a fault domain. If you lose a site (power, flood, network, whatever), then you are vulnerable to data loss should a single disk in the remaining site fail. Looking forward to a future release providing some raid redundancy within a fault domain for stretch clusters!

  4. Is it possible to have the witness node and the 2 data nodes at the same site? Everything I’m reading about ROBO shows the witness node at a separate site or centralized site. Is there a hard licensing requirement to prevent from having all three at the same geo site and having only two data nodes with the witness appliance all existing at the same site on the same layer 2 network?

    • Yes – you can certainly group all of them together on the same site if you wish. However you will need a third host at that site for the witness (which can be a physical host or an appliance). If it is an appliance, it will still need to be deployed on an ESXi host. This is why most of the diagrams are showing the witness deployed as an appliance, but placed on an ESXi host back at the same DC. You can now have multiple ROBO sites, each with their witness deployed back in the main DC on one ESXi host. We are looking at alternative ways of hosting the witness appliance (workstation, fusion, vCloud Air) but at the moment it is only supported on ESXi.

  5. So let’s say I want to deploy a single 2-host VSAN, not in a Remote Office or Branch Office. I do not have a dedicated centralized vCenter. I would still have to use a 3rd server running ESXi or an appliance as the witness for VSAN ROBO to work correctly. What would be my options for using VSAN ROBO as a simple, isolated 2-host VSAN installation?

    • You mean that you would like to run 2-hosts in a VSAN cluster using a witness appliance? I don’t think that is an issue. Both a VSAN standard license and a VSAN ROBO license (limiting to 25 VMs) would allow you to do this. But yes, you would need a third ESXi to run either (a) as your physical witness host or (b) run the virtual witness appliance (which is ESXi in a VM).

      • Thank you for your answer. As it is, there is no way to run a 2-hosts VSAN cluster without a 3rd server as a witness. That witness cannot reside as a VM on the 2-hosts VSAN cluster.

        Right? I know this is an obvious question but I need confirmation 🙂

  6. Hello,

    Regarding the ROBO environment, as 10G is the recommendation for vSAN, is it a must to extend the 10G network from the remote office to the central office? Management would go through 1G, but it’s not that common to extend a 10G network.

    Regards,

    • Hi Eloy – no, there is no need for 10G back to the main/central office. I assume the plan is to manage the ROBO deployment from the central office, as well as locate the witness in the central office. The requirements for the witness when we have around 5-10 VMs at the remote site is only 1.5Mbps. I think our tech marketing team are working on a VSAN ROBO white paper which will detail all of this.

      • Thanks Cormac. Actually we want to deploy some more VMs, up to 30, so we are considering a 10G network in the ROBO site with vSAN. Our doubt is about the witness appliance: can it be joined through a 1G network? As you say there are 2 vSwitches, one for management and another one to link with the vSAN traffic. Is this scenario possible?

    • Hi,

      i like to add the question: can we use a crosslink 10Gb network connection at the robo site or is using a switch mandatory since the witness site has to use the same (vSAN-) network?

  7. Thank you so much for the great article that explains the witness role in detail. My question is: if I am using the nested witness for my ROBO deployment, will I be able to use the local storage of the ESXi host where the nested VM resides? To be more clear, will the nested ESXi prevent any VM being deployed?

    • Nope – the physical ESXi can continue to be used as a normal ESXi server, and other VMs alongside the nested witness appliance can happily run there. You cannot use the nested ESXi (witness appliance) to run VMs – that is not supported.

  8. Hi, is there any prerequisite for the physical host that contains the witness appliance? Can it be just a free ESXi hypervisor with just one internal HDD (and the NICs of course)?

  9. Could you run the vCenter Server Appliance for the VSAN cluster on the 3rd physical host ( the one hosting the witness appliance ) ?

  10. Hi Cormac
    What is the upgrade path when going from VSAN 5.5 to latest VSAN with the stretched cluster functionality?
    Do I upgrade the VSAN to 6 using the filesystem & component upgrades, then install the witness on my third site followed by the creation of the fault domains, assign the hosts to sites and finally select the witness?

    • You will have to upgrade VC and ESXi from 5.x to 6.x, then do the rolling upgrade of the on-disk format. Once that is complete, you can begin work on implementing the stretched cluster.
