What’s new in vSAN 6.6?

by Cormac · April 11, 2017 · April 12, 2017

vSAN 6.6 is finally here. This sixth iteration of vSAN is quite a significant release for many reasons, as you will read about shortly. In my opinion, this may be the vSAN release with the most new features. Let’s cut straight to the chase and highlight all the features of this next version of vSAN. There is a lot to tell you about, so now might be a good time to grab yourself a cup of coffee.

Encryption

vSAN 6.6 offers DARE (Data At Rest Encryption). Yes, vSphere 6.5 also offers per-VM encryption through the use of policies, but that is done at the VM layer, and if deduplication is enabled at the vSAN layer, you don’t get the space-saving benefits from it. Encryption in vSAN 6.6 takes place at the lowest level, meaning that you can also get the benefits of dedupe and compression. vSAN encryption is enabled at the cluster level, but it is implemented at the physical disk layer, so that each disk has its own key provided by a supported Key Management Server (KMS).

This feature relies heavily on AES-NI (Advanced Encryption Standard New Instructions), a set of CPU instructions that accelerate AES. This is available on all modern CPUs. There are new health checks which ensure that the KMS is still accessible, and that all the hosts in the vSAN cluster support AES-NI.

A word of caution! Make sure you have your KMS protected. If you lose your keys, you lose your data. So if you are going to implement vSAN Encryption, ensure you have a good backup/restore mechanism in the event of your KMS going pop.

Local Protection in vSAN Stretched Clusters

I know this is a feature that a lot of vSAN Stretched Cluster customers have been looking for. In fact, I know of potential customers who have put off implementing vSAN stretched cluster because we did not have this feature. I’m delighted that this is now available in vSAN 6.6. In a nutshell, through policies, customers can now specify a protection level on a per site basis, as well as across sites.

There are now two protection policies: Primary level of failures to tolerate (PFTT) and Secondary level of failures to tolerate (SFTT). In a stretched cluster, PFTT defines cross-site protection, implemented as RAID-1, while SFTT defines local site protection, which can be implemented as RAID-1, RAID-5 or RAID-6. This means that even if there is a full site failure, a VM can still be protected against host or disk failure in the remaining site.
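To get a feel for what combining the two policies costs in raw capacity, here is a back-of-the-envelope Python sketch. It assumes the usual vSAN layouts (RAID-1 keeps FTT+1 full copies, RAID-5 is 3 data + 1 parity, RAID-6 is 4 data + 2 parity) and ignores witness components and any dedupe/compression savings. The function names are made up for illustration.

```python
from fractions import Fraction


def local_multiplier(sftt: int, method: str) -> Fraction:
    """Raw capacity consumed per usable GB inside one site (assumed layouts)."""
    if method == "RAID-1":
        return Fraction(sftt + 1)          # SFTT + 1 full mirrors
    if method == "RAID-5" and sftt == 1:
        return Fraction(4, 3)              # 3 data + 1 parity
    if method == "RAID-6" and sftt == 2:
        return Fraction(3, 2)              # 4 data + 2 parity
    raise ValueError(f"unsupported combination: SFTT={sftt}, {method}")


def stretched_multiplier(pftt: int, sftt: int, method: str) -> Fraction:
    """PFTT is RAID-1 across sites, so PFTT=1 keeps one full copy per data site."""
    return (pftt + 1) * local_multiplier(sftt, method)


for sftt, method in [(1, "RAID-1"), (1, "RAID-5"), (2, "RAID-6")]:
    total = stretched_multiplier(1, sftt, method)
    print(f"PFTT=1, SFTT={sftt}, {method}: {float(total):.2f}x raw capacity")
```

Under these assumptions, mirroring locally on both sites is the most expensive option, which is one reason the erasure-coding choices for SFTT may be attractive on all-flash configurations.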

One thing to note: the witness for the cross-site protection must remain accessible, i.e. there must still be one data site and the witness site available. SFTT will not protect against the loss of a data site AND the loss of a witness.

One question you might ask is whether local site protection increases the amount of traffic that needs to be shipped over the inter-site link. The answer is no. We have now implemented a “Proxy Owner” feature for each site. This means that instead of writing to all replicas in the remote site, we now do a single write to the Proxy Owner on the remote site, and this is then responsible for writing to all replicas on the remote site. Thus there is still only a single cross-site write for multiple replicas.
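The effect on the inter-site link can be sketched in a few lines of Python. This is only a toy model of the counting argument above, not of how vSAN actually routes I/O; the function name and parameters are hypothetical.

```python
def cross_site_writes(remote_replicas: int, proxy_owner: bool) -> int:
    """Writes that cross the inter-site link for one guest write (toy model)."""
    if remote_replicas == 0:
        return 0
    # With a proxy owner, one write crosses the link and the proxy
    # fans it out to the replicas inside the remote site.
    return 1 if proxy_owner else remote_replicas


# Example: SFTT=1 with RAID-1 means two replicas on the remote site.
print(cross_site_writes(2, proxy_owner=False))
print(cross_site_writes(2, proxy_owner=True))
```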

Please note that this is not nested fault domains. In other words, you do not have control over where to place the components of an object on the data sites. All we guarantee is that we can protect the VM locally against a host or disk failure as before. We cannot do rack awareness at each site with this feature.

Secondary level of failures to tolerate only appears as a policy option when vSAN stretched cluster is enabled/configured.

Site Affinity in vSAN Stretched Clusters

Some customers have expressed an interest in being able to deploy VMs on a vSAN stretched cluster with FTT=0, in other words, with no tolerance for failures. This has typically been for applications that have a backup running elsewhere, or that have the ability to replicate internally at the application level, e.g. SQL Server AlwaysOn. Customers can now use the Affinity policy to request that a particular VM is assigned to a particular site, from a storage perspective. This is akin to specifying data locality for a particular VM. This policy is only applicable when the primary level of failures to tolerate is set to 0.

Customers need to ensure that DRS/HA rules align with data locality. In other words, customers should pin the VM’s compute to the site where the VMDK resides via affinity groups. There is no automatic way to determine this at present. Affinity only appears as a policy option when vSAN stretched cluster is enabled/configured.

Unicast Mode

Yes – we have finally removed our reliance on multicast. No more IGMP snooping, or PIM for routing multicast traffic. In vSAN 6.6, all hosts will now talk unicast, and the vCenter server becomes the source of truth for cluster membership. If you are upgrading from a previous version of vSAN, vSAN will automatically switch to unicast once all hosts have been upgraded to vSAN 6.6.

Of course, with all these things there are caveats. For example, if the on-disk format has not been upgraded to the latest version 5, and a pre-vSAN 6.6 host is added to the cluster, then the cluster reverts to multicast.

On the other hand, if you have upgraded to on-disk format v5, and then add a pre-vSAN 6.6 host to the cluster, the cluster will continue to talk unicast, but this newly added host can only talk multicast so it will be partitioned. If the cluster is at 6.6, and the on-disk format is v5, don’t add any pre-vSAN 6.6 hosts to the cluster.
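The two caveats above boil down to a small decision table. The Python sketch below simply encodes the behavior described in the text for a cluster that is already fully on vSAN 6.6 and talking unicast; the names are illustrative and nothing here queries a real cluster.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    cluster_mode: str            # "unicast" or "multicast"
    new_host_partitioned: bool   # True if the added host cannot join


def add_pre66_host(on_disk_format: int) -> Outcome:
    """What happens when a pre-vSAN 6.6 host joins a unicast 6.6 cluster."""
    if on_disk_format >= 5:
        # Cluster stays on unicast; the old host only speaks multicast.
        return Outcome("unicast", True)
    # On-disk format not yet upgraded: the whole cluster reverts to multicast.
    return Outcome("multicast", False)


for version in (3, 5):
    print(f"on-disk format v{version}:", add_pre66_host(version))
```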

This removal of multicast as a requirement will definitely make vSAN deployments much easier from a networking requirements perspective.

By the way, if you run esxcli vsan network list, multicast information will still be displayed even though it may not be used. The following new esxcli command will tell you which hosts are using unicast (note, however, that it does not list the host on which the command is run):

[root@esxi-dell-i:~] esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address      Port  Iface Name
------------------------------------  ---------  ----------------  -------------  -----  ----------
58d8ef12-bda6-e864-9400-246e962c23f0          0              true  172.200.0.123  12321
58d8ef50-0927-a55a-3678-246e962f48f8          0              true  172.200.0.122  12321
58d8ef61-a37d-4590-db1d-246e962f4978          0              true  172.200.0.124  12321

Here is the same command run from a vSAN stretched cluster (note the witness):

[root@esxi-dell-j:~] esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address     Port  Iface Name
------------------------------------  ---------  ----------------  ------------  -----  ----------
58d29c9e-e01d-eeea-ac6b-246e962f4ab0          0              true  172.4.0.121   12321
58d8ef12-bda6-e864-9400-246e962c23f0          0              true  172.3.0.123   12321
58d8ef61-a37d-4590-db1d-246e962f4978          0              true  172.3.0.124   12321
00000000-0000-0000-0000-000000000000          1              true  147.80.0.222  12321
[root@esxi-dell-j:~]
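If you want to post-process this output, for example to pick out the witness, a few lines of Python will do. This is a minimal sketch that parses pasted text in the column layout shown above; it assumes the Iface Name column is empty, as it is in these samples, and it does not invoke esxcli itself.

```python
import re
from typing import List, NamedTuple

SAMPLE = """\
NodeUuid                              IsWitness  Supports Unicast  IP Address     Port  Iface Name
------------------------------------  ---------  ----------------  ------------  -----  ----------
58d29c9e-e01d-eeea-ac6b-246e962f4ab0          0              true  172.4.0.121   12321
58d8ef12-bda6-e864-9400-246e962c23f0          0              true  172.3.0.123   12321
58d8ef61-a37d-4590-db1d-246e962f4978          0              true  172.3.0.124   12321
00000000-0000-0000-0000-000000000000          1              true  147.80.0.222  12321
"""


class UnicastAgent(NamedTuple):
    node_uuid: str
    is_witness: bool
    supports_unicast: bool
    ip_address: str
    port: int


UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")


def parse_unicastagent_list(text: str) -> List[UnicastAgent]:
    agents = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the dashed separator and any shell prompt lines.
        if len(fields) < 5 or not UUID_RE.match(fields[0]):
            continue
        agents.append(UnicastAgent(fields[0], fields[1] == "1",
                                   fields[2] == "true", fields[3], int(fields[4])))
    return agents


agents = parse_unicastagent_list(SAMPLE)
print("witness:", [a.ip_address for a in agents if a.is_witness])
print("data nodes:", [a.ip_address for a in agents if not a.is_witness])
```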

More visibility into objects via UI

This is something close to my heart. In previous releases of vSAN, we’ve only been able to see the VM Home and VMDK objects represented in the UI. Items like the VM Swap object could only be observed from the RVC. In this vSAN release, we now have the ability to query the status of other objects such as the VM Swap, as shown below.

Also note here that the VMDKs that have the VM name associated with them (e.g. Hard disk 9 -vcsa-06.rainpole.com_8.vmdk) are snapshot delta objects. These are now also viewable from within the UI.

Smart Rebuilds

In previous releases, vSAN never re-used the original component after a rebuild operation to a new component had started. So if a host is absent for more than 60 minutes, the components on that host are never re-used even if the host comes back after 60 minutes.

vSAN 6.6 introduces a new smart rebuild behavior. If the absent components come back online, even after 60 minutes, vSAN compares the cost of re-using the old components versus the cost of continuing to resync the new components. vSAN then chooses the one with the lower cost and cancels the resync for the other one. This feature saves a lot of unnecessary resyncing and temporary space usage when a host goes absent and comes back after 60 minutes.
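The decision can be pictured as a simple comparison. The sketch below assumes, purely for illustration, that "cost" means the number of bytes still left to sync on each side; the real cost model is not described here, and the function name and tie-breaking rule are hypothetical.

```python
def choose_rebuild_path(stale_bytes_on_old: int, remaining_bytes_on_new: int) -> str:
    """Pick the cheaper way back to compliance (illustrative cost model only).

    stale_bytes_on_old:     data the returning component missed while absent
    remaining_bytes_on_new: data still to be copied to the replacement component
    """
    if stale_bytes_on_old <= remaining_bytes_on_new:
        return "reuse-old"       # catch up the old component, cancel the new resync
    return "continue-new"        # finish the new component, discard the old one


GB = 1024 ** 3
# Host returns after 70 minutes: 2 GB changed meanwhile, 150 GB still to rebuild.
print(choose_rebuild_path(2 * GB, 150 * GB))
# Host returns after many hours: 90 GB changed, rebuild has only 5 GB to go.
print(choose_rebuild_path(90 * GB, 5 * GB))
```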

There is another enhancement to the rebuild mechanism in vSAN 6.6. The rebalancing protocol has been updated to address previous vSAN limitations. One issue was an inability to break large components into smaller chunks. This meant that attempts to rebalance large components sometimes failed due to space constraints (i.e. no place to rebuild a large component). In vSAN 6.6, rebalancing can now break large components into smaller chunks when physical disk capacity is at greater than 80% utilization.

I did a test of this feature in my lab. In one example, I noticed a large 232.1 GB component split into a concatenation of 2 x 116.1 GB components:

Original:

Component: bbcbe458-094f-18d6-5b90-246e962f4978 (state: ACTIVE (5), host: esxi-dell-l.rainpole.com, 
md: naa.500a07510f86d69f, ssd: naa.55cd2e404c31f9a9,votes: 1, usage: 232.1 GB, proxy component: false)

Example 1:

 Concatenation
Component: 79ece458-dfff-d828-5d74-246e962f4ab0 (state: ACTIVE (5), host: esxi-dell-j.rainpole.com, 
md: naa.500a07510f86d6c7, ssd: naa.55cd2e404c31f8f0,votes: 1, usage: 116.1 GB, proxy component: false)
Component: 79ece458-c790-db28-fead-246e962f4ab0 (state: ACTIVE (5), host: esxi-dell-j.rainpole.com, 
md: naa.500a07510f86d684, ssd: naa.55cd2e404c31f8f0, votes: 1, usage: 116.1 GB, proxy component: false)

On another occasion, as disk space became more and more scarce, the component was split into a concatenation of 8 x 29 GB chunks.

Example 2:

 Concatenation 
Component: a10be558-6fae-0105-403c-246e962f48f8 (state: ACTIVE (5), host: esxi-dell-i.rainpole.com, 
md: naa.500a07510f86d6c5, ssd: naa.55cd2e404c31ef8d, votes: 1, usage: 29.0 GB, proxy component: false) 
Component: a10be558-e141-0405-17b2-246e962f48f8 (state: ACTIVE (5), host: esxi-dell-i.rainpole.com, 
md: naa.500a07510f86d6c5, ssd: naa.55cd2e404c31ef8d, votes: 1, usage: 29.0 GB, proxy component: false) 
Component: a10be558-8564-0505-f136-246e962f48f8 (state: ACTIVE (5), host: esxi-dell-k.rainpole.com, 
md: naa.500a07510f86d69c, ssd: naa.55cd2e404c31e2c7, votes: 1, usage: 29.0 GB, proxy component: false) 
Component: a10be558-67c1-0605-8f2c-246e962f48f8 (state: ACTIVE (5), host: esxi-dell-k.rainpole.com, 
md: naa.500a07510f86d6b8, ssd: naa.55cd2e404c31e2c7, votes: 1, usage: 29.0 GB, proxy component: false) 
.
.
.
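The two lab results (2 chunks, then 8 chunks) are consistent with repeatedly halving a component until each piece fits somewhere. The Python sketch below models that idea only; the halving strategy and the names are assumptions drawn from these two observations, not a description of vSAN's actual splitting logic.

```python
from typing import List


def split_component(size_gb: float, largest_free_gb: float,
                    max_chunks: int = 64) -> List[float]:
    """Halve a component until every chunk fits in the largest free extent."""
    if largest_free_gb <= 0:
        raise ValueError("no free space available")
    chunks = 1
    while size_gb / chunks > largest_free_gb:
        chunks *= 2
        if chunks > max_chunks:
            raise ValueError("component cannot be placed even after splitting")
    return [size_gb / chunks] * chunks


# Plenty of room for ~120 GB pieces: two chunks of about 116 GB.
print(len(split_component(232.1, 120.0)))
# Space is scarce, only ~30 GB extents left: eight chunks of about 29 GB.
print(len(split_component(232.1, 30.0)))
```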

Partial Repairs

If there are objects with components that are degraded or absent for more than 60 minutes, vSAN will try to repair all the components of that object to make the object completely compliant once more. In previous releases, if there were not enough available resources to repair all the impacted components of an object, vSAN simply did not attempt a repair.

In vSAN 6.6, vSAN now tries to repair as many impacted components as possible, even if it can’t repair all the components of an object. This is important in scenarios where partially repairing the components in an object can allow vSAN to tolerate additional failures to the object, even though there are not enough resources in the cluster to make the object fully compliant.

Here is a scenario from a stretched cluster using local site protection to explain this concept a little better. If one site is down, then all VMs are running on the remaining site. If there is another failure in the remaining site, vSAN will now repair as many components as possible but may not be able to repair all the components in an object.

12 node cluster (6+6+1)
6 nodes remaining after a site failure
Secondary FTT=2
Secondary FTM=RAID-1
Requires 5 hosts – 3 copies of the data, 2 witnesses (2n + 1)

If the remaining site now suffers 2 additional host failures, vSAN rebuilds data copies on the 4 remaining hosts. However, vSAN can then only repair up to FTT=1 (failures to tolerate) due to a lack of remaining hosts. To implement FTT=2, we need 2n+1 hosts (5), but there are only 4 remaining on this site. Therefore FTT=1 is the most that vSAN can repair.

Partial repair is only done if it results in a higher FTT. Data components are always repaired before witness components (we prioritize data components).
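The 2n+1 arithmetic in this scenario is easy to check with a short Python sketch. It covers RAID-1 only, and the function names are just for illustration.

```python
def hosts_required(ftt: int) -> int:
    """RAID-1 needs 2n+1 hosts: n+1 data copies plus n witnesses."""
    return 2 * ftt + 1


def achievable_ftt(target_ftt: int, hosts_available: int) -> int:
    """Highest FTT (up to the policy target) that the surviving hosts can hold."""
    return max(0, min(target_ftt, (hosts_available - 1) // 2))


print(hosts_required(2))                       # hosts needed for Secondary FTT=2
print(achievable_ftt(2, hosts_available=6))    # site intact: full compliance
print(achievable_ftt(2, hosts_available=4))    # after two host failures
```

With 6 hosts the site can satisfy FTT=2; with only 4 left, the best a partial repair can reach is FTT=1, which matches the scenario above.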

Resync Throttling

Another major enhancement around resync is to give the end-user control over resync activity from both the UI and the CLI. This has been something many customers have been asking for. While under normal circumstances we would strongly recommend that the resync throttling should be left disabled, there may be occasions where you wish to throttle it up or down. The screenshot below shows where this can be done.

Finally, in the past, if a resync process was interrupted, the resync might have needed to start all over again. Now in vSAN 6.6, resync activity will resume from where it left off (if interrupted) by using a new resync bitmap to track changes. A very desirable improvement indeed.
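The general idea of resuming from a bitmap can be illustrated with a toy Python class. This is a generic sketch of the technique, with one bit per region that still needs copying; the actual on-disk bitmap format and granularity in vSAN are not described here, and every name below is hypothetical.

```python
from typing import List, Optional


class ResyncBitmap:
    """One bit per region; a set bit means the region still needs copying."""

    def __init__(self, num_regions: int) -> None:
        self.num_regions = num_regions
        self.pending = (1 << num_regions) - 1   # start with everything pending

    def mark_dirty(self, region: int) -> None:
        """A write landed on this region, so it must be (re)copied."""
        self.pending |= 1 << region

    def resync(self, max_regions: Optional[int] = None) -> List[int]:
        """Copy pending regions; stop early to simulate an interruption."""
        copied = []
        for region in range(self.num_regions):
            if max_regions is not None and len(copied) >= max_regions:
                break
            if self.pending & (1 << region):
                self.pending &= ~(1 << region)  # clear the bit once copied
                copied.append(region)
        return copied


bitmap = ResyncBitmap(8)
print(bitmap.resync(max_regions=3))   # interrupted after three regions
print(bitmap.resync())                # resumes with only the remaining regions
```

Because the cleared bits survive the interruption, the second pass never revisits regions that were already copied.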

New pre-checks for maintenance mode

vSAN now does a capacity pre-check before allowing a host to enter maintenance mode. This will prevent scenarios where putting a host into maintenance mode may put capacity constraints on a cluster.

As you can see, this details how much space is needed for a full data evacuation, or for the ensure data accessibility option. If you choose no data evacuation, it will tell you how many objects are impacted.

New pre-checks for disk group, disk removal

Similar to maintenance mode, there are now pre-checks if you wish to decommission a disk group or a disk. By default, it selects to evacuate ALL data, but by changing the option to ensure data accessibility or no data evacuation, you are presented with an impact statement.

This can be a very useful visual aid when replacing or upgrading devices. A similar view is available for disks.

Easy Install and Config Assist

There are two parts to this feature. The first is the ability to deploy vSAN more easily. The net result is that the installer will guide you through a successful deployment of a vCenter Server to a single ESXi host that is configured with a vSAN datastore. This will take care of the networking setup, the claiming of local storage devices, the creation of the vSAN datastore and so on. Of course, you will need to grow your cluster to the minimum of 3 nodes afterwards, but this is just a matter of dropping the hosts into the cluster.

And just in case you need guidance after the cluster is set up, there is another new tool called “Config Assist” to help you verify that your vSAN cluster is configured correctly. It will highlight items such as DRS not configured, or vSphere HA not configured, along with a bunch of other items such as network configuration and hardware compatibility. Below is an example of some of the tests. These features need a lot more detail, so I will follow up with another blog post in due course.

Online/Cloud health

This is something I am very passionate about: proactive notifications about potential issues, as well as the ability to provide prescriptive guidance on what to do when something does go wrong. In vSAN 6.6, we are taking our first steps towards using analytics to provide you with this information in real-time. Once you enable CEIP (Customer Experience Improvement Program), information about your vSAN system is sent back to us here at VMware. We can then use this information to provide additional online health checks which are specific to your environment. There is a lot more to say about this, so I will follow up with a more thorough blog post at a future date.

HTML5 Host Client Integration

For those of you who have started to use the new HTML5 host client, there are now a number of vSAN workflows included in the new client. One of the nicer enhancements is the health feature, which allows you to get a look at the overall health of the vSAN cluster from a single host client:

There are also options for enabling data services, such as deduplication, via the new HTML5 host client.

New esxcli commands

I already showed you the new unicastagent esxcli command earlier. However, a new esxcli command to assist with troubleshooting has also been added.

esxcli vsan debug
 Usage: esxcli vsan debug {cmd} [cmd options]

Available Namespaces:
 disk        Debug commands for vSAN physical disks
 object      Debug commands for vSAN objects
 resync      Debug commands for vSAN resyncing objects
 controller  Debug commands for vSAN disk controllers
 limit       Debug commands for vSAN limits
 vmdk        Debug commands for vSAN VMDKs

As well as the esxcli vsan debug command, we also added the following commands in vSAN 6.6 to get troubleshooting information:

• esxcli vsan health cluster
• esxcli vsan resync bandwidth
• esxcli vsan resync throttle

Example 1:
 Use the "vsan debug disk" command to check the status of all physical disks:
 
 [root@esxi-dell-j:~] esxcli vsan debug disk list
 UUID: 52bc5813-b8a5-b004-60cd-82cf6cda6426
    Name: naa.500a07510f86d6c7
    SSD: True
    Overall Health: green
    Congestion Health:
          State: green
          Congestion Value: 0
          Congestion Area: none
    In Cmmds: true
    In Vsi: true
    Metadata Health: green
    Operational Health: green
    Space Health:
          State: green
          Capacity: 800155762688 bytes
          Used: 20883439616 bytes
          Reserved: 3187671040 bytes
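Since the space figures are reported in raw bytes, a small script can make them easier to read. The Python sketch below parses pasted output in the layout shown above (abbreviated here to the relevant fields) and reports how full each disk is; it works on captured text only and the function name is illustrative.

```python
from typing import Dict, List

SAMPLE = """\
 UUID: 52bc5813-b8a5-b004-60cd-82cf6cda6426
    Name: naa.500a07510f86d6c7
    SSD: True
    Overall Health: green
    Space Health:
          State: green
          Capacity: 800155762688 bytes
          Used: 20883439616 bytes
          Reserved: 3187671040 bytes
"""


def parse_debug_disk_list(text: str) -> List[Dict[str, object]]:
    """Collect name and space counters for each disk record in the output."""
    disks: List[Dict[str, object]] = []
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep:
            continue
        if key == "UUID":
            disks.append({"UUID": value})        # a new disk record starts here
        elif disks and key in ("Name", "Capacity", "Used", "Reserved"):
            # Byte counters look like "800155762688 bytes"; keep the number.
            disks[-1][key] = int(value.split()[0]) if value.endswith("bytes") else value
    return disks


for disk in parse_debug_disk_list(SAMPLE):
    pct = 100 * disk["Used"] / disk["Capacity"]
    print(f"{disk['Name']}: {disk['Capacity'] / 1e9:.0f} GB, {pct:.1f}% used")
```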