2-node vSAN – witness network design considerations

It seems that 2-node vSAN deployments for ROBO (remote office/branch office) are becoming more and more popular. The fact that one can now connect the 2 vSAN hosts at the remote office directly back-to-back, without needing a 10Gb switch, has reduced the cost considerably. And with the introduction of a vSAN Enterprise for ROBO license edition with vSAN 6.6.1, you get the full feature set of vSAN on 2-node deployments. This new edition builds on the vSAN Advanced edition, and enables the use of features like native encryption and stretched clusters on a per-VM pricing model for smaller sites.

The purpose of this post is to address some of the more common questions that we get these days about 2-node deployments, namely those about networking between the remote site and the main DC back at HQ. In most cases, the main DC will host the vCenter Server for managing the vSAN cluster as well as the witness appliance, so network connectivity is needed between HQ and the remote site for both the witness network and the management network.

These days, 2-node ROBO implementations typically separate the witness traffic from the vSAN traffic. This means that the vSAN data only flows over the direct connect VMkernel interfaces between the 2 vSAN nodes at the remote site, while the witness traffic (which is minimal) is routed back to the witness appliance residing in the main DC at HQ via another interface. This feature, known as WTS (short for Witness Traffic Separation), is discussed in more detail in the vSAN Stretched Cluster and 2-node guide. Duncan has already written an article answering another common question about whether all witness traffic is able to share the same VLAN between remote sites and HQ if required (the answer is yes).

Duncan also states something else that is important to know for ROBO deployments and witness traffic: ROBO locations must send the witness traffic over L3. Another thing to note is that no multicast is used for witness traffic. Witness traffic has always been unicast, so there is no need to worry about routing multicast with PIM, etc.

In the layout diagram taken from his post, all of the remote sites have unique VLANs for both the management network and the witness network. These networks are L3, and require static routes to be created.
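As a rough illustration of the static routes mentioned above, the following Python sketch builds the `esxcli network ip route ipv4 add` command that could be run on a remote host so that witness traffic can reach the witness network at HQ. The helper name `static_route_command` is hypothetical, and the subnet and gateway in the usage note are made-up placeholders, not values from any real deployment.

```python
import ipaddress


def static_route_command(witness_network: str, gateway: str) -> str:
    """Build an esxcli command that adds a static route to a remote network.

    witness_network: destination subnet in CIDR form (assumed: the witness
                     network at HQ).
    gateway:         next-hop address reachable from the host's own subnet.
    """
    # strict=True rejects CIDR strings with host bits set, e.g. 10.0.0.5/24.
    network = ipaddress.ip_network(witness_network, strict=True)
    gw = ipaddress.ip_address(gateway)
    if network.version != 4 or gw.version != 4:
        raise ValueError("this sketch only handles IPv4 routes")
    # A next hop inside the destination network would mean the network is
    # not actually remote, so treat that as a likely typo.
    if gw in network:
        raise ValueError("gateway must not be inside the destination network")
    return f"esxcli network ip route ipv4 add -n {network} -g {gw}"
```

For example, `static_route_command("192.168.110.0/24", "172.16.1.1")` produces the command for a host whose local gateway is 172.16.1.1 and whose witness appliance lives in 192.168.110.0/24. A matching route back to the remote site is also needed on the witness side.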

Another setup may be one where the management network is part of a stretched L2 network that can reach the remote sites. It’s quite unlikely this will be the case when you have a lot of remote sites, but you may come across it when there is only one, or a very small number, of remote sites. If you have multiple subnets routed to each site for the witness traffic, you may have a configuration which looks like this.

Similarly, if the remote sites have a single routed subnet, the witness networks may all have to share the same VLAN once more, as Duncan mentioned in his post referenced earlier. This may look something like the following:

Of course, here I have drawn the management networks with their own separate VLANs per site. But as in the previous topologies, the management network could again be a stretched L2.

Now there is another scenario that we get quite a few questions on, which is that customers may only have a single subnet routed between the main DC and the remote sites. In this case, the management interface of the hosts may also be tagged for witness traffic, which is very light anyway. So in this example, the witness traffic and the management traffic share the same VLAN, and only one subnet is routed. Back at the main DC at HQ, the witness appliance may be modified so that a single interface can be used for both management and witness traffic. This is quite straightforward to do, and means that the original portgroup created at deployment time for vSAN traffic on the witness appliance may be removed.
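To show what tagging an interface for witness traffic might look like on the remote hosts, here is a small Python sketch that builds the `esxcli vsan network ip add` command with the witness traffic type. The helper name `witness_tag_command` is hypothetical, and using `vmk0` as the management VMkernel interface is an assumption (it is the ESXi default, but your hosts may differ).

```python
def witness_tag_command(vmk: str = "vmk0") -> str:
    """Build the esxcli command that tags a VMkernel interface for witness traffic.

    vmk: VMkernel interface name; vmk0 is assumed to be the management
         interface, which is only the default and may not match your hosts.
    """
    # VMkernel interfaces are named vmk followed by a number (vmk0, vmk1, ...).
    if not (vmk.startswith("vmk") and vmk[3:].isdigit()):
        raise ValueError(f"not a VMkernel interface name: {vmk!r}")
    return f"esxcli vsan network ip add -i {vmk} -T=witness"


if __name__ == "__main__":
    print(witness_tag_command())
```

Running `esxcli vsan network list` on the host afterwards should list the interface with its traffic type, which is a quick way to check that the tag was applied.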

We have spoken about this to our engineering and product management teams to make sure there are no support issues around this approach, and they have confirmed to us that there are no issues with this design, should a customer wish to implement it. We are now updating our official docs to call this out as a supported configuration for 2-node / ROBO vSAN deployments.

10 comments
    • This configuration is designed to tolerate 1 failure. VMs deployed on 2-node have the 1st copy of the data on the 1st node, the 2nd copy of the data on the 2nd node, and the witness component/metadata on the witness appliance back at HQ.
      Using a ‘quorum’ based mechanism, we need 2 out of 3 components available for the VM/data to remain accessible. If we lose the witness, or the connection between the remote site and HQ, the VMs still have a majority/quorum with the 2 copies of the data, so the VM remains accessible.
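The 2-out-of-3 rule described above can be sketched in a few lines of Python. This is a simplified model that assumes each of the three components (two data copies plus the witness) carries exactly one vote; the function name `vm_accessible` and the component labels are made up for illustration.

```python
from typing import Mapping


def vm_accessible(components_up: Mapping[str, bool]) -> bool:
    """Return True if more than half of the components are available.

    Simplifying assumption: every component carries exactly one vote.
    """
    available = sum(1 for up in components_up.values() if up)
    return available * 2 > len(components_up)


# Witness (or the link to HQ) is lost: both data copies remain, quorum holds.
witness_lost = {"node1_copy": True, "node2_copy": True, "witness": False}

# One node fails: the surviving copy plus the witness still form a majority.
node_lost = {"node1_copy": False, "node2_copy": True, "witness": True}

# A node and the witness are both lost: only 1 of 3 remains, no quorum.
double_failure = {"node1_copy": False, "node2_copy": True, "witness": False}
```

With these inputs, `vm_accessible` returns `True` for the first two cases and `False` for the third, which matches the "tolerate 1 failure" behaviour.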

  1. We want to build a 2-node vSAN (with 25G direct connect) with 2 pure NVMe servers. We want to use pure 7mm high (with dedupe and compression enabled). I have one question: why must a vSAN in this constellation have a cache disk of 10% of the disk group? Can it be dispensed with? I don’t see any performance benefit…

    • That’s simply the design, Michael. It’s been like this since the initial vSAN hybrid 5.5 days. Having said that, we are looking at ways to remove this requirement going forward, but I do not have any dates or timelines, nor can I confirm if we will actually implement this.

  2. Greetings. For lab purposes, can witness traffic traverse a stretched L2 VLAN, similar to the management network? In other words, I don’t have L3 connectivity, and the witness has been deployed on a host that has L2 connectivity with the 2 nodes of the vSAN cluster.

  3. Is it possible to build a L3 Design that is Switchless/Routerless for the witness Traffic separation/Site with direct connect to the 2node vSAN Hosts? Is there a configuration example?

  4. How much bandwidth and resources does a witness appliance use? it may be small, but need to know specifics when designing a central witness ‘repository’ for 500 remote sites of less than 10 VMs each.
