VSAN 6.0 Part 1 – New quorum mechanism
vSphere 6.0 was released yesterday, and it includes the new version of Virtual SAN, 6.0. I now want to start sharing some of the new features and functionality with you. One of the things we always enforced with version 5.5 was that when you deployed a VM with NumberOfFailuresToTolerate = 1, you always had at least 3 components: the 1st copy of the data, the 2nd copy of the data, and a witness component for quorum. In version 5.5, for a VM to remain accessible, “one full copy of the data and more than 50% of components must be available”. We have introduced some subtle differences around quorum and VM accessibility in version 6.0.
In this example, I have a four-node VSAN cluster, and each node has 3 magnetic disks, for 12 disks in total. I have deployed a VM with NumberOfDiskObjectsToStripe = 6. With NumberOfFailuresToTolerate = 1 that means two copies of six stripes each, or 12 data components, which should place a component on every disk in the cluster. Once I deployed the VM and checked the layout of the VMDK, I see the following:
Note that this view has moved to the Monitor tab in 6.0; it was previously in the Manage tab. What is interesting is that you don’t see a witness component with this configuration. This is the major change in quorum in VSAN 6.0. Each component now has a number of “votes”, which may be one or more. To achieve quorum in Virtual SAN 6.0, “more than 50 percent of votes” rather than “more than 50 percent of components” is needed. You can use the RVC command vsan.vm_object_info to show the vote count for each component:
> vsan.vm_object_info 3
VM test-sw=6:
  Namespace directory
    DOM Object: 32ca0255-6a56-981f-3542-005056030a0b (v2, owner: esx-01a.corp.local, policy: hostFailuresToTolerate = 1, stripeWidth = 1, spbmProfileId = 998e8bcb-f96e-4e82-a6cb-3204100aa715, proportionalCapacity = [0, 100], spbmProfileGenerationNumber = 1)
      RAID_1
        Component: 33ca0255-2860-9197-6c1b-005056030a0b (state: ACTIVE (5), host: esx-02a.corp.local, md: mpx.vmhba1:C0:T0:L0, ssd: mpx.vmhba0:C0:T0:L0, votes: 1, usage: 0.3 GB)
        Component: 33ca0255-4c85-9497-367f-005056030a0b (state: ACTIVE (5), host: esx-01a.corp.local, md: mpx.vmhba4:C0:T0:L0, ssd: mpx.vmhba2:C0:T0:L0, votes: 1, usage: 0.3 GB)
        Witness: 33ca0255-5c41-9697-50e5-005056030a0b (state: ACTIVE (5), host: esx-04a.corp.local, md: mpx.vmhba4:C0:T0:L0, ssd: mpx.vmhba3:C0:T1:L0, votes: 1, usage: 0.0 GB)
  Disk backing: [vsanDatastore] xxx-xxx-xx-xxx/test-sw_6.vmdk
    DOM Object: 65ca0255-b095-bc34-757c-005056030a0b (v2, owner: esx-03a.corp.local, policy: spbmProfileGenerationNumber = 1, stripeWidth = 6, spbmProfileId = 998e8bcb-f96e-4e82-a6cb-3204100aa715, hostFailuresToTolerate = 1)
      RAID_1
        RAID_0
          Component: 66ca0255-a632-c8ba-8a21-005056030a0b (state: ACTIVE (5), host: esx-04a.corp.local, md: mpx.vmhba4:C0:T1:L0, ssd: mpx.vmhba3:C0:T0:L0, votes: 2, usage: 0.0 GB)
          Component: 66ca0255-f53a-cbba-c71a-005056030a0b (state: ACTIVE (5), host: esx-04a.corp.local, md: mpx.vmhba4:C0:T0:L0, ssd: mpx.vmhba3:C0:T1:L0, votes: 1, usage: 0.0 GB)
<>
Although this output has been truncated for this post, you can see that each component now has a vote count, which means a witness component is not always needed, as is the case for the VMDK object here. Note, however, that the VM Home Namespace object (the first entry above) still uses a witness.
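If you want to total the votes without reading them off by eye, a small script can do it. The following is only a rough sketch: it assumes the text layout shown in the output above, the regular expression and the parse_votes helper are my own illustration and not part of RVC, and the sample contains just the two VMDK component lines that survived the truncation.

```python
import re

# Two component lines copied from the truncated VMDK output above.
sample = """
Component: 66ca0255-a632-c8ba-8a21-005056030a0b (state: ACTIVE (5), host: esx-04a.corp.local, md: mpx.vmhba4:C0:T1:L0, ssd: mpx.vmhba3:C0:T0:L0, votes: 2, usage: 0.0 GB)
Component: 66ca0255-f53a-cbba-c71a-005056030a0b (state: ACTIVE (5), host: esx-04a.corp.local, md: mpx.vmhba4:C0:T0:L0, ssd: mpx.vmhba3:C0:T1:L0, votes: 1, usage: 0.0 GB)
"""

# Hypothetical pattern: kind, UUID, host and vote count from one output line.
LINE_RE = re.compile(
    r"(Component|Witness): (\S+) \(.*?host: ([^,]+),.*?votes: (\d+)"
)

def parse_votes(text):
    """Return a list of (kind, uuid, host, votes) tuples found in the text."""
    rows = []
    for match in LINE_RE.finditer(text):
        kind, uuid, host, votes = match.groups()
        rows.append((kind, uuid, host, int(votes)))
    return rows

rows = parse_votes(sample)
for kind, uuid, host, votes in rows:
    print(f"{kind:9} {host:20} votes={votes}")
print("votes in sample:", sum(r[3] for r in rows))
```

Because the sample is truncated, the sum here is not the full vote count of the VMDK object; run it against the complete output to get that.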
Of course, a witness may still be used if vote counts alone don’t provide adequate quorum guarantees. However, the new methodology should reduce the component count, which is one of the VSAN limits.
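To make the difference between the two rules concrete, here is a minimal sketch of the quorum arithmetic only. It is not VSAN code: the Component class and both function names are made up for illustration, and it deliberately ignores the separate requirement that a full copy of the data be available.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str        # illustrative label, not a real VSAN identifier
    votes: int
    available: bool

def has_quorum_55(components):
    """5.5-style rule: more than 50% of components are available."""
    up = sum(1 for c in components if c.available)
    return 2 * up > len(components)

def has_quorum_60(components):
    """6.0-style rule: more than 50% of votes are available."""
    total = sum(c.votes for c in components)
    up = sum(c.votes for c in components if c.available)
    return 2 * up > total

# VM Home Namespace from the output above (two replicas plus a witness,
# one vote each), with one replica assumed to be unavailable.
namespace = [
    Component("replica-1", 1, True),
    Component("replica-2", 1, False),
    Component("witness", 1, True),
]
print(has_quorum_55(namespace), has_quorum_60(namespace))

# Hypothetical object with votes 2, 1, 1, 1 and two 1-vote components down:
# only half of the components remain, but 3 of the 5 votes do.
weighted = [
    Component("c1", 2, True),
    Component("c2", 1, True),
    Component("c3", 1, False),
    Component("c4", 1, False),
]
print(has_quorum_55(weighted), has_quorum_60(weighted))
```

The second case shows why unequal votes can stand in for a witness: the vote-based rule can break a tie that the component-count rule cannot.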
Cormac, thanks for these updates. I have a four-node VSAN here at San Jose Unified School District. I am getting ready for a transition to 6.0 and plan to bump the nodes to 8, then collapse an EMC Celerra / Clariion and two Cisco UCS blade servers into it all. These articles have been a huge help.
Always nice to hear, Chris – thanks.
Cormac, thanks for the hint. I believe this topic needs much more clarification. Would it be possible to show some typical VSAN designs configured with FTT=1 and Fault Domains, along with some what-if scenarios?
On a to-do list Thomas 🙂
But yes, I agree, some of this would be very interesting to analyze and show the different behaviors, especially with Fault Domains.
So, to simplify, take the scenario of a VM with FTT=1/SW=1:
We have 4 components but 5 votes, where:
Host A or FD A: VM1p1 (1 vote) / VM1p2 (1 vote)
Host B or FD B: VM1p1b (1 vote)
Host C or FD C: VM1p2b (2 votes)
In this case, we do not need a witness with FTT=1.
If Host A is down, we still have more than 50% of the votes (3 of 5) with the 2 remaining data components.
If Host B is down, same situation (4 of 5 votes) with 3 components.
If Host C is down, same situation (3 of 5 votes).
A witness still exists in cases where the object cannot be spread across 2n+1 hosts.
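The vote arithmetic in this comment can be checked with a few lines of Python. This is just a sketch of the layout as described above: the dictionary and the votes_after_failure helper are illustrative, and it only counts votes, so it says nothing about whether a full copy of the data survives each failure.

```python
# Components and votes per host (or fault domain), as listed in the comment.
layout = {
    "A": {"VM1p1": 1, "VM1p2": 1},
    "B": {"VM1p1b": 1},
    "C": {"VM1p2b": 2},
}

def votes_after_failure(layout, failed_host):
    """Sum the votes of all components not on the failed host."""
    return sum(
        votes
        for host, components in layout.items()
        if host != failed_host
        for votes in components.values()
    )

total = sum(v for components in layout.values() for v in components.values())

for host in layout:
    left = votes_after_failure(layout, host)
    print(f"Host {host} down: {left}/{total} votes, quorum={2 * left > total}")
```

With 5 votes in total, any single host failure leaves at least 3, which is why no witness is needed for quorum in this layout.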
Thanks for adding that, Julienne.