2-node vSAN topologies review
There has been a lot of discussion in the past around supported topologies for 2-node vSAN, specifically around where we can host the witness. My good pal Duncan has already highlighted some of this in his blog post here, but questions continue to come up about where the witness can, and cannot, be placed for a 2-node vSAN deployment. I also want to highlight that many of these configuration considerations are covered by our official documentation. For example, there is the very comprehensive VMware Virtual SAN 6.2 for Remote Office and Branch Office Deployment Reference Architecture, which talks about hosting the witness back in a primary data center, as well as another Reference Architecture document which covers Running VMware vSAN Witness Appliance in VMware vCloud Air. So considering all of the above, let’s look at which topologies are supported with 2-node vSAN deployments, and which ones are not:
Witness running in the main DC
In this first example, we fully support having the witness (W) run remotely on another vSphere site, such as back in your primary datacenter. This is covered in detail in the VMware Virtual SAN 6.2 for Remote Office and Branch Office Deployment Reference Architecture mentioned earlier.
Witness running in vCloud Air
In this next example, we fully support having the witness (W1) run remotely in vCloud Air. This is covered in detail in the Running VMware vSAN Witness Appliance in VMware vCloud Air Reference Architecture mentioned earlier.
Witness running on another standard vSAN deployment
Now this one is interesting. A common question is whether or not one can run the witness (W) on a vSAN deployment back in the main DC. The answer is yes, this is fully supported. The crux of the matter, as stated by the vSAN Lead Engineer Christian Dickmann, is that “We support any vSphere to run the witness that has independent failure properties”. In other words, a failure on the 2-node vSAN at the remote site will not impact the availability of the standard vSAN environment at the main DC.
Witness running on another 2-node vSAN deployment, and vice-versa
This final configuration is the one which Duncan has described in detail in his post, so I won’t go into it too much. Suffice to say that this configuration breaks the guidance around “We support any vSphere to run the witness that has independent failure properties.” In this case there is an inter-dependency between the 2-node vSAN deployments at each of the remote sites, as each site hosts the witness of the other 2-node deployment (W1 is the witness for the 2-node vSAN deployment at remote site 1, and W2 is the witness for the 2-node vSAN deployment at remote site 2). Thus if one site has a failure, it impacts the availability of the other site.
[Update] As of March 16th, 2017, VMware has changed its stance on this configuration. We will now support it through our RPQ process. There are several constraints with this deployment, and customers need to fully understand and agree to them before we approve the RPQ. So we will change this to not recommended, but supported via RPQ. The vSAN team will selectively approve which deployments to support, so don’t assume that just because you submit an RPQ, you will be supported.
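To make the inter-dependency concrete, here is a minimal Python sketch that flags pairs of deployments hosting each other's witness. The function name, the dictionary layout and the site names are all hypothetical and chosen purely for illustration. It only detects the mutual, two-site case described above, and it is not a VMware tool or a supportability check.

```python
from typing import Dict, List, Tuple


def find_cross_hosted_witnesses(witness_location: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return pairs of deployments that host each other's witness.

    witness_location maps a 2-node deployment name to the name of the
    environment where its witness runs (hypothetical names, for illustration).
    """
    pairs = []
    for deployment, host_env in witness_location.items():
        # Mutual dependency: the hosting environment's own witness
        # runs back on this deployment. The name comparison keeps
        # each pair from being reported twice.
        if witness_location.get(host_env) == deployment and deployment < host_env:
            pairs.append((deployment, host_env))
    return sorted(pairs)


# Hypothetical layout: sites 1 and 2 host each other's witness,
# site 3 keeps its witness in the main DC.
placements = {
    "remote-site-1": "remote-site-2",
    "remote-site-2": "remote-site-1",
    "remote-site-3": "main-dc",
}
cross_hosted = find_cross_hosted_witnesses(placements)
print(cross_hosted)
```

In this made-up layout only the pair of remote sites 1 and 2 is reported. Remote site 3, whose witness runs in the main DC, has independent failure properties and is not flagged.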
Hope this helps clarify the support around the different 2-node topologies, especially for witness placement.
Licensing
There is one final topic that I wish to bring up with 2-node + witness deployments, and that is around licensing. Note that even though the witness is an appliance, it is an ESXi host running in a VM. And although we supply a license with the appliance, it will still consume a license in vCenter when it comes to management. For example, say you deploy a 2-node vSAN. The 2-node vSAN will need 2 ESXi hosts at the remote site, but there may be a 3rd physical server that could be used for hosting vCenter as well as the witness appliance. If you are using a vSphere Essentials license, you will not be able to add the witness appliance as vSphere Essentials can only manage 3 hosts. There is some discussion about this internally at VMware at the moment, but as of right now, this is a restriction that you may encounter with vSphere Essentials.
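The host arithmetic in that example can be sketched in a few lines of Python. The 3-host limit is taken from the paragraph above. The function names are hypothetical, and the sketch assumes that the third physical server is also added to vCenter as a managed host.

```python
# Host limit for vSphere Essentials, as stated in the post.
ESSENTIALS_MAX_HOSTS = 3


def managed_host_count(data_nodes: int, other_physical_hosts: int, witness_appliances: int) -> int:
    """Count the ESXi hosts vCenter has to manage.

    The witness appliance is an ESXi host running in a VM, so it is
    counted like any other host.
    """
    return data_nodes + other_physical_hosts + witness_appliances


def fits_essentials(data_nodes: int, other_physical_hosts: int, witness_appliances: int) -> bool:
    """True if the total host count stays within the Essentials limit."""
    total = managed_host_count(data_nodes, other_physical_hosts, witness_appliances)
    return total <= ESSENTIALS_MAX_HOSTS


# The example from the post: 2 vSAN nodes, a 3rd physical server, 1 witness appliance.
total = managed_host_count(2, 1, 1)
print(total, fits_essentials(2, 1, 1))
```

With two vSAN nodes, the third physical server and the witness appliance, the count reaches four hosts, one more than Essentials can manage.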
So does the unsupported option also apply to “normal” stretched vSAN clusters?
We have a customer project like this: Two 4 host stretched clusters (2 hosts room A/2 hosts room B) on two separated sites, each running the witness appliance of the other site. In case of a failure of one host or room on one site, the witness appliance would be restarted by HA within that cluster. So redundancy for the other site would be restored within minutes.
When designing this solution for our customer, we got confirmation from VMware that this is a supported configuration.
So, now it would be interesting if we are in an unsupported state…
I could not say definitively – I have not come across such a design to be honest.
However, if you have received an official VMware support statement, then I guess you are good to go. It might be worth checking once more with your VMware contact, though, to verify that they understand your implementation exactly. It might also be worthwhile highlighting this post to your VMware contacts.
So Johannes, I’ve just recently got an update on this, and cross-hosting the witness in a stretched cluster is “not” supported. Feel free to have your VMware rep contact me if there are any outstanding questions.
Thanks Cormac for the update. Actually, a bad update for our customer, since we have such a concept in place (and in rollout) and it was confirmed by VMware Germany a year ago.
Is there any explanation for not supporting this concept?
I mean, there is enough redundancy in place with such a concept. I would like to have a clear understanding for this since this concept could have benefits for many of our customers (as our company faces special use cases with our customers in the mission-critical business).
I have seen that you have updated your blog. Would an RPQ be possible for our case?
Please speak to your local VMware contacts Johannes – they will need to raise this directly with the storage business unit.
I am on it… thanks!
Good study
Is there any progress that you are aware of concerning the essentials 3 node licensing issue?
I am not aware of any – let me ask again internally.
Still working on a solution Martin – none available just yet, but hopefully soon.
Thx!
Good to hear that they are working on a solution. Will it be in the way of supplying another license, or by means of changing the detection method of the number of hosts?
TBD.
ack 😉