A deeper-dive into Fault Tolerant VMs running on vSAN

After receiving a number of queries about vSphere Fault Tolerance on vSAN over the past couple of weeks, I decided to take a closer look at how Fault Tolerant VMs behave with different vSAN policies. I wanted to compare two policies: one where “failures to tolerate” (commonly referred to as FTT) is set to 0, and one where FTT is set to 1. The question is whether we could deploy VMs without any vSAN protection and rely on vSphere Fault Tolerance to protect them instead.

Which policy changes can trigger a rebuild on vSAN?

Some time ago, I wrote about which policy changes can trigger a rebuild of an object. This came up again recently, as it was something that Duncan and I covered in our VMworld 2017 session on top 10 vSAN considerations. In the original post (which is over 3 years old now), I highlighted items such as increasing the stripe width, growing the read cache reservation (relevant only to hybrid vSAN), and changing FTT when the read cache reservation is non-zero (again, relevant only to hybrid vSAN), all of which led to a rebuild of the object (or components within the object). The other…

How many hosts are needed to implement SFTT in vSAN Stretched Cluster?

Many of you who are well versed in vSAN will realize that we released a Secondary Failures To Tolerate (SFTT) feature with vSAN 6.6. This meant that not only could we tolerate failures across sites, but we could also add another layer of redundancy to each copy of the data maintained at each of the data sites. Of course, the cross-site replication (now referred to as PFTT or Primary Failures To Tolerate) is still based on RAID-1 mirroring, and this continues to require a third site for the witness appliance, so that quorum can be obtained in the…

Debunking some behavior “myths” in 3 node vSAN cluster

I recently noticed a blog post describing some very strange behaviors in 2-node and 3-node vSAN clusters. I was especially concerned to read that when the authors introduced a failure and then fixed that failure, they did not experience any auto-recovery. I have reached out to the authors of the post to check some details, such as the version of vSAN, the type of failure, etc. Unfortunately, I haven’t had a response yet, but I did feel compelled to put the record straight. In the following post, I am going to introduce a variety of operations and failures in my…

VSAN 6.0 Part 8 – Fault Domains

One of the really nice new features of VSAN 6.0 is fault domains. Previously, there was very little control over where VSAN placed virtual machine components. In order to protect against something like a rack failure, you may have had to use a very high NumberOfFailuresToTolerate value, resulting in multiple copies of the VM data dispersed around the cluster. With VSAN 6.0, this is no longer a concern, as hosts participating in the VSAN cluster can be placed in different fault domains. This means that component placement will take place across fault domains and not just across hosts. Let’s look…

VSAN Part 36: Considerations when using Force Provisioning

One policy setting that I have yet to discuss in any great detail in my blog posts about VSAN is ForceProvisioning. When placed in the VM Storage Policy, this setting allows Virtual SAN to violate the NumberOfFailuresToTolerate (FTT), NumberOfDiskStripesPerObject (SW) and FlashReadCacheReservation (FRCR) policy settings during the initial deployment of a virtual machine. This can be useful for many reasons. One reason is that it enables the boot-strapping of a vCenter server on a VSAN deployment, as highlighted by William Lam in this excellent blog post on the subject. Another reason is that it allows the provisioning of virtual machines…