I spent the last 10 days at VMware HQ in Palo Alto, and as you might imagine, had lots of really interesting conversations and meet-ups. One of those conversations revolved around minimum VSAN configurations. Let’s start with the basics.
- 2-node: There are two physical hosts for data and a witness appliance hosted elsewhere. Data is placed on the physical hosts, and the witness appliance holds the witness components only, never any data.
- 3-node: There are three physical hosts, and the data and witness components are distributed across all hosts. This configuration supports a number of failures to tolerate (FTT) = 1 with RAID-1 configurations.
- 4-node: There are four physical hosts, and the data and witness components are distributed across all hosts. This configuration supports FTT=1 with both RAID-1 and RAID-5 configurations. It also allows VSAN to self-heal in the event of a failure when RAID-1 is used.
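The host counts above follow directly from how many components each policy creates. Here is a minimal sketch of that arithmetic, based on the figures in this post; the function names are illustrative only, not any VMware API:

```python
# Component counts and minimum host counts for the vSAN policies
# discussed in this post. Illustrative sketch, not a VMware API.

def components(ftt: int, raid: str) -> int:
    """Number of components per object for the policies discussed here."""
    if raid == "RAID-1":
        # FTT+1 data copies plus FTT witness(es); for FTT=1 that is
        # 2 mirrors + 1 witness = 3 components.
        return (ftt + 1) + ftt
    if raid == "RAID-5":
        # A RAID-5 stripe is made up of 4 components (3 data + 1 parity),
        # and is only available with FTT=1.
        return 4
    raise ValueError(f"unsupported policy: {raid}")

def min_hosts(ftt: int, raid: str, self_heal: bool = False) -> int:
    """Each component lives on its own host; self-healing needs a spare host."""
    return components(ftt, raid) + (1 if self_heal else 0)

print(min_hosts(1, "RAID-1"))                  # 3
print(min_hosts(1, "RAID-1", self_heal=True))  # 4
print(min_hosts(1, "RAID-5", self_heal=True))  # 5
```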
Let’s elaborate on that last point. What do we mean by self-healing? If a host (or some other piece of infrastructure) fails, and there are free resources available in the cluster, VSAN can automatically repair the affected objects. With RAID-1 and FTT=1, each object has 3 components: the first copy of the data, the second copy of the data, and a witness. These are all located on separate hosts. If one of those hosts fails, the component on that host can be rebuilt elsewhere, and the VM is once again fully protected against a failure.
In the case of a 2-node or 3-node cluster, this is not possible. If a host fails, there is no place to rebuild the missing component. We do not place both copies of the data, or the data and a witness, on the same host – that would be pointless, since if that host failed we would lose a quorum of components, and thus access to the VM’s object. This is one of the reasons VMware recommends a 4-node cluster at minimum for FTT=1 with RAID-1. Note however that if RAID-5 configurations are chosen, you cannot have this self-healing behaviour with 4 nodes: since a RAID-5 stripe is made up of 4 components, each component is already placed on a different host. To have self-healing with RAID-5, you need a minimum of 5 hosts in the cluster.
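The quorum and self-healing rules described above can be sketched with a toy model: an object stays accessible while a majority of its components survive, and a rebuild needs a host that holds none of the object’s components. This is purely illustrative (real vSAN uses per-component votes, not a simple count):

```python
# Toy model of vSAN object accessibility and self-healing, assuming one
# component per host. Illustrative only; real vSAN uses component votes.

def accessible(total_components: int, failed: int) -> bool:
    """An object stays accessible while a majority of components survive."""
    surviving = total_components - failed
    return surviving > total_components / 2

def can_self_heal(hosts: int, total_components: int, failed_hosts: int) -> bool:
    """A rebuild needs spare hosts holding none of the object's components."""
    spare_hosts = hosts - total_components
    return accessible(total_components, failed_hosts) and spare_hosts >= failed_hosts

# RAID-1, FTT=1: 3 components (2 data copies + 1 witness)
print(accessible(3, 1))        # True  - VM stays up after one host failure
print(can_self_heal(3, 3, 1))  # False - 3-node cluster: nowhere to rebuild
print(can_self_heal(4, 3, 1))  # True  - 4-node cluster: rebuild on the spare host
print(can_self_heal(4, 4, 1))  # False - RAID-5 on 4 nodes: no spare host
print(can_self_heal(5, 4, 1))  # True  - RAID-5 on 5 nodes can self-heal
```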
There is another reason for this recommendation on minimum hosts, and it has to do with maintenance mode. When placing a VSAN node into maintenance mode, you typically choose between “full data migration” and “ensure accessibility”. With “full data migration”, all of the components (data and witnesses) on the host being placed into maintenance mode are rebuilt on the remaining nodes. This once again ensures that your VMs remain fully protected while the maintenance operation takes place, and can survive another failure in the cluster even when a host is in maintenance mode. With FTT=1 and RAID-1, you once again need a minimum of 4 hosts to achieve this. With FTT=1 and RAID-5, you need a minimum of 5 hosts.
The other option, “ensure accessibility”, is the only option that can be used with 2-node and 3-node configurations since, once again, there is no place to rebuild the components. In these cases, “ensure accessibility” simply means that a single copy of the VM data remains available. For example, a VM deployed with FTT=0 residing on the host being placed into maintenance mode would have that component rebuilt on another remaining node. For VMs deployed with FTT=1, there should be no need to move any data when “ensure accessibility” is chosen. However, you are now running with only 2 out of 3 components, and another failure whilst the host is in maintenance mode can render your VMs inaccessible.
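That “2 out of 3 components” situation can be made concrete with a toy quorum check, again assuming one component per host and a simple majority rule (illustrative only, not the actual vSAN voting mechanism):

```python
# "Ensure accessibility" on a 3-node cluster, sketched with a toy
# majority rule. A RAID-1/FTT=1 object has 3 components, one per host.
# Illustrative only; real vSAN uses component votes.

def surviving_majority(total: int, unavailable: int) -> bool:
    """True while a majority of an object's components remain available."""
    return (total - unavailable) > total / 2

total = 3            # 2 data copies + 1 witness
in_maintenance = 1   # components on the host in maintenance mode

print(surviving_majority(total, in_maintenance))      # True:  2 of 3, VM still accessible
print(surviving_majority(total, in_maintenance + 1))  # False: a further failure loses quorum
```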
Hopefully this explains why we make some minimum recommendations on the number of hosts in a VSAN cluster. While you are fully supported with 2-node and 3-node configurations, there are definitely some availability considerations when using these minimum configurations.