2-node vSAN – witness network design considerations

It seems that 2-node vSAN deployments for ROBO (remote office/branch office) sites are becoming more and more popular. The fact that one can now connect the 2 vSAN hosts at the remote office directly back-to-back, without needing a 10Gb switch, has reduced the cost considerably. And with the introduction of a vSAN Enterprise for ROBO license edition in vSAN 6.6.1, you get the full feature set of vSAN on 2-node deployments. This new edition builds on the vSAN Advanced edition, and enables the use of features like native encryption and stretched clusters on a per-VM pricing model for smaller sites. The…

New vSAN Stretched Cluster Topology now supported

After publishing the vSAN Networking Guide earlier this year, Paudie O’Riordain and I received numerous questions regarding support for having different stretched clusters host each other’s witness appliances. These queries arose because we discussed a 2-node (ROBO) topology which allowed this sort of configuration (i.e. the cross-hosting of witnesses) via special request. But in the networking guide, we explicitly stated that this was not supported with vSAN stretched clusters. However, after some further testing by our engineering teams, we can now relax this restriction if there are 4 independent sites hosting the different stretched clusters. In this case,…

Erasure Coding and Quorum on vSAN

I was looking at the layout of a RAID-5 object configuration the other day, and while these objects were deployed on vSAN with 4 components, something caught my eye. It wasn’t the fact that there were 4 components, which is what one would expect since we implement RAID-5 as a 3+1, i.e. 3 data segments and 1 parity segment. No, what caught my eye was that one of the components had a different vote count. Now, RAID-5 and RAID-6 erasure coding configurations are not the same as RAID-1. With RAID-1, we deploy multiple copies of the data depending on how many…
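To make the voting mechanics concrete, here is a minimal sketch (plain Python, purely illustrative and not vSAN internals) of why an even component count needs one vote bumped. The strict-majority rule and the extra vote on a single component reflect the behaviour described above; the function name and vote values are just assumptions for illustration:

```python
# Illustrative sketch only, not vSAN source code.
# Assumed rule: an object stays accessible only while the surviving
# components hold a strict majority of the object's total votes.

def has_quorum(surviving_votes, total_votes):
    return sum(surviving_votes) > total_votes / 2

# RAID-5 (3+1) means 4 components. With 1 vote each the total is 4 (even),
# so a clean 2/2 partition leaves NEITHER side with a majority:
even = [1, 1, 1, 1]
print(has_quorum(even[:2], sum(even)))   # False: a tie, object inaccessible

# Bumping one component to 2 votes makes the total 5 (odd); a tie is now
# impossible, so exactly one side of any partition can win quorum:
odd = [2, 1, 1, 1]
print(has_quorum(odd[:2], sum(odd)))     # True  (3 of 5 votes)
print(has_quorum(odd[2:], sum(odd)))     # False (2 of 5 votes)
```

Note that the votes only decide quorum; a RAID-5 object with FTT=1 still only tolerates a single component failure as far as the data itself is concerned.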

2-node vSAN topologies review

There has been a lot of discussion in the past around supported topologies for 2-node vSAN, specifically around where we can host the witness. Now my good pal Duncan has already highlighted some of this in his blog post here, but questions continue to come up about where one can, and where one cannot, place the witness for a 2-node vSAN deployment. I also want to highlight that many of these configuration considerations are covered by our official documentation. For example, there is the very comprehensive VMware Virtual SAN 6.2 for Remote Office and Branch Office Deployment Reference Architecture…

Another recovery from multiple failures in a vSAN stretched cluster

In a previous post related to multiple failures in a vSAN stretched cluster, we showed that if a failure caused the data components to be out of sync, the most recent copy of the data needs to recover before the object becomes accessible again. This is true even if a majority of components are available (e.g. an old data copy and the witness). This is to ensure that we do not recover the “STALE” copy of the data, which might have out-of-date information. To briefly revisit the previous post, the accessibility of the object when there are multiple failures…
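The rule is easier to see in a toy model. Below is a short illustrative sketch (again, not how vSAN is actually implemented) where each data copy carries an assumed “generation” stamp; the point it demonstrates is the one above, that quorum alone is not enough if the only surviving data copy is STALE:

```python
# Toy model of the accessibility rule described above. The component
# dictionaries and "generation" stamps are assumptions for illustration.

def object_accessible(components, latest_generation):
    alive = [c for c in components if c["alive"]]
    quorum = len(alive) > len(components) / 2
    # At least one surviving data copy must also be up to date:
    fresh = any(c["is_data"] and c["generation"] == latest_generation
                for c in alive)
    return quorum and fresh

# FTT=1: two data copies (A, B) plus a witness (W). A took writes at
# generation 2 while B was absent; then A failed and B returned.
components = [
    {"name": "A", "is_data": True,  "generation": 2, "alive": False},
    {"name": "B", "is_data": True,  "generation": 1, "alive": True},   # STALE
    {"name": "W", "is_data": False, "generation": 2, "alive": True},
]

# B + W form a majority, yet the object remains inaccessible until A,
# the most recent copy of the data, recovers:
print(object_accessible(components, latest_generation=2))  # False
```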

Understanding recovery from multiple failures in a vSAN stretched cluster

Some time back I wrote an article that described what happens when an object deployed on a vSAN datastore has a policy of Number of Failures to Tolerate set to 1 (FTT=1), and multiple failures are introduced. For simplicity, let’s label the three components that make up our object with FTT=1 as A, B and W. A and B are data components and W is the witness component. Let’s now assume that we lose access to component A. Components B & W are still available, and the object (e.g. a VMDK) is still available. The state of these two components (B…
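For the single-failure case above, a couple of lines are enough to show why the object survives losing A: a quick sketch (illustrative only) of the majority rule over the three components:

```python
# Majority rule over the three components {A, B, W} of an FTT=1 object.
# Illustrative only: 2 of the 3 components constitute a majority.

def accessible(alive_components):
    return len(alive_components) >= 2

print(accessible({"B", "W"}))  # True: losing A alone keeps the VMDK available
print(accessible({"B"}))       # False: a second failure takes the object offline
```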