New vSAN Stretched Cluster Topology now supported

After publishing the vSAN Networking Guide earlier this year, Paudie O’Riordain and I received numerous questions regarding support for having different stretched clusters host each other’s witness appliances. These queries arose because we discussed a 2-node (ROBO) topology which allowed this sort of configuration (i.e. the cross hosting of witnesses) via special request. But in the networking guide, we explicitly stated that this was not supported with vSAN stretched clusters. However, after some further testing by our engineering teams, we can now relax this restriction if there are 4 independent sites hosting the different stretched clusters. In this case,…

Erasure Coding and Quorum on vSAN

I was looking at the layout of a RAID-5 object configuration the other day, and while these objects were deployed on vSAN with 4 components, something caught my eye. It wasn’t the fact that there were 4 components, which is what one would expect since we implement RAID-5 as a 3+1, i.e. 3 data segments and 1 parity segment. No, what caught my eye was that one of the components had a different vote count. Now, RAID-5 and RAID-6 erasure coding configurations are not the same as RAID-1. With RAID-1, we deploy multiple copies of the data depending on how many…

What’s new in vSAN 6.6?

vSAN 6.6 is finally here. This sixth iteration of vSAN is quite a significant release for many reasons, as you will read about shortly. In my opinion, this may be the vSAN release with the most new features. Let’s cut straight to the chase and highlight all the features of this next version of vSAN. There is a lot to tell you about. Now might be a good time to grab yourself a cup of coffee.

Another recovery from multiple failures in a vSAN stretched cluster

In a previous post related to multiple failures in a vSAN stretched cluster, we showed that if a failure caused the data components to be out of sync, the most recent copy of the data needs to recover before the object becomes accessible again. This is true even if a majority of components is available (e.g. old data copy and witness). This is to ensure that we do not recover the “STALE” copy of the data, which might have out-of-date information. To briefly revisit the previous post, the accessibility of the object when there are multiple failures…
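The rule described above (a majority alone is not enough; an up-to-date data copy must also be present) can be sketched in a few lines of Python. This is only an illustrative model and not how vSAN implements it: the `Component` class, its fields, the one-vote-per-component default and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    """Hypothetical model of one component of an object (not a vSAN API)."""
    name: str
    votes: int = 1          # assumption: every component carries one vote
    available: bool = True  # can the component currently be reached?
    has_data: bool = True   # False for a witness, which holds no data
    stale: bool = False     # True if this data copy missed recent writes


def has_quorum(components: List[Component]) -> bool:
    """True when the available components hold more than half of all votes."""
    total = sum(c.votes for c in components)
    live = sum(c.votes for c in components if c.available)
    return live * 2 > total


def is_accessible(components: List[Component]) -> bool:
    """Quorum is required, plus at least one available, non-stale data copy."""
    fresh_copy = any(
        c.available and c.has_data and not c.stale for c in components
    )
    return has_quorum(components) and fresh_copy


# The newest data copy is down; only the old copy and the witness remain.
old_copy_and_witness = [
    Component("data-1", available=False),
    Component("data-2", stale=True),
    Component("witness", has_data=False),
]
```

In this sketch `has_quorum(old_copy_and_witness)` is true, yet `is_accessible(old_copy_and_witness)` is false, which mirrors the "old data copy and witness" case: the object stays inaccessible until the most recent copy recovers.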

Understanding recovery from multiple failures in a vSAN stretched cluster

Some time back I wrote an article that described what happens when an object deployed on a vSAN datastore has a policy of Number of Failures to Tolerate set to 1 (FTT=1), and multiple failures are introduced. For simplicity, let’s label the three components that make up our object with FTT=1 as A, B and W. A and B are data components and W is the witness component. Let’s now assume that we lose access to component A. Components B & W are still available, and the object (e.g. a VMDK) is still available. The state of these two components (B…

vSAN Stretched Cluster – Partition behavior changes

My good pal Paudie and I are back in full customer mode these past few weeks, testing out lots of new and upcoming features in future releases of vSAN. Our testing led us to building a new vSAN stretched cluster, with 5 nodes on the preferred site, 5 nodes on the secondary site, and of course the obligatory witness node. Now, it had been a while since we put vSAN stretched cluster through its paces. The last time was the vSAN 6.1 release, which led us to create some additional sections on stretched cluster for the vSAN Proof Of Concept…