What’s in the vSphere and vSAN 6.7 release?

Today VMware unveils vSphere version 6.7, which also includes a new version of vSAN. In this post, I am going to highlight some of the big-ticket items that are in vSphere 6.7 from a core storage perspective, and also some of the new features that you will find in vSAN 6.7. I’ll also cover some of the new enhancements coming in Virtual Volumes (VVols).

vSphere 6.7 Core Storage features

HTML 5 Client

We have now ported all of our storage workflows to the new H5 client. This is true for vSphere core storage, vSAN, VVols, SPBM, etc. So I would strongly recommend switching over to the new H5 client and familiarizing yourself with it as much as possible. Even in this 6.7 release, there are certain workflows that are only available in the H5 client, and going forward, this will be the only client that will support new functionality.

New Limit Increases

Let’s begin by discussing the increase in limits for devices and paths. In vSphere 6.5, we increased the number of device paths from 1024 to 2000 per ESXi host. We also increased the number of devices from 256 per host to 512 per host (not via the same target, but via multiple targets). In vSphere 6.7, we are increasing these limits once again. We are now going to support 4096 paths per ESXi host and bump the number of devices supported per host from 512 to 1024.
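
To make the arithmetic concrete, here is a minimal Python sketch that checks a host design against the per-host maximums quoted above. The function name and the example design (600 LUNs with 4 paths each) are hypothetical and purely illustrative; this is not a VMware tool.

```python
# Per-host maximums as quoted in this post.
LIMITS = {
    "6.5": {"devices": 512, "paths": 2000},
    "6.7": {"devices": 1024, "paths": 4096},
}


def fits(release: str, devices: int, paths_per_device: int) -> bool:
    """Return True if the design stays within both per-host maximums."""
    limit = LIMITS[release]
    total_paths = devices * paths_per_device
    return devices <= limit["devices"] and total_paths <= limit["paths"]


# Hypothetical design: 600 LUNs, each visible over 4 paths (2400 paths in total).
print(fits("6.5", 600, 4))  # False: exceeds both the device and the path limit
print(fits("6.7", 600, 4))  # True
```

Note that the path limit is often the one you hit first: 1024 devices with 4 paths each lands exactly on the 4096-path maximum.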

4K Native Device Support

There is now support for 4K native devices (4KN). These are devices that use a 4KB sector size rather than the traditional 512-byte sector size. The advantage of course is that we can now have much larger capacity disk devices on vSphere. In vSphere 6.5 we introduced VMFS-6. This new version of VMFS is already designed to handle these new sector sizes. But what about your legacy applications running in the Guest OS? Can these handle 4K sectors? You may also remember that in vSphere 6.5, we announced support for 512e devices. These were 4K devices which could emulate 512-byte sector devices, thus the “e” in 512e. What we have done is taken this hardware “emulation” and moved it out of the drives and up into our I/O stack. Thus, as far as the Guest OSes are concerned, they are still dealing with 512-byte sector drives thanks to this software emulation.
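
The general idea behind 512-byte emulation on top of 4K sectors can be sketched as a read-modify-write: a write smaller than a physical sector has to read the whole 4K sector, patch the affected bytes, and write it back. The toy Python model below illustrates that technique only; the class and its fields are hypothetical, and it makes no claim about how the ESXi I/O stack actually implements it.

```python
LOGICAL = 512    # sector size presented to the guest
PHYSICAL = 4096  # sector size of the underlying 4K native device


class Emulated512Disk:
    """Toy model: presents 512-byte sectors on top of 4096-byte sectors."""

    def __init__(self, physical_sectors: int) -> None:
        self.media = bytearray(physical_sectors * PHYSICAL)
        self.rmw_count = 0  # physical sectors that needed a read-modify-write

    def read(self, lba: int, count: int = 1) -> bytes:
        start = lba * LOGICAL
        return bytes(self.media[start:start + count * LOGICAL])

    def write(self, lba: int, data: bytes) -> None:
        if len(data) % LOGICAL:
            raise ValueError("writes must be whole 512-byte sectors")
        start = lba * LOGICAL
        end = start + len(data)
        # Visit every physical sector touched by this logical write.
        for psec in range(start // PHYSICAL, (end - 1) // PHYSICAL + 1):
            p_start = psec * PHYSICAL
            lo = max(start, p_start)
            hi = min(end, p_start + PHYSICAL)
            if hi - lo < PHYSICAL:
                self.rmw_count += 1  # partial sector: read, patch, write back
            buf = bytearray(self.media[p_start:p_start + PHYSICAL])
            buf[lo - p_start:hi - p_start] = data[lo - start:hi - start]
            self.media[p_start:p_start + PHYSICAL] = buf


disk = Emulated512Disk(physical_sectors=4)
disk.write(1, b"\xaa" * LOGICAL)    # 512-byte write: partial physical sector
disk.write(8, b"\xbb" * PHYSICAL)   # aligned 4K write: no read-modify-write
print(disk.rmw_count)  # 1
```

The counter shows why alignment matters: guests that issue aligned 4K I/O avoid the extra read entirely, while small or misaligned writes pay for it.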

UNMAP Enhancements

vSphere 6.5 saw the introduction of Automated UNMAP. This was run at a very low priority and only reclaimed at a very conservative rate of 25MBps. In vSphere 6.7, we are now going to allow end-users to tune this reclaim rate with a new reclaim method called “Fixed”. This new Space Reclaim Setting is available on VMFS-6 volumes and allows you to adjust the reclaim rate. Simply select the volume from the HTML5 client UI, go to Configure > General and under Space Reclamation Settings, click Edit. This will allow you to tune the Space Reclamation rate from 100MBps to 2GBps. Changing the reclamation rate automatically changes your reclaim method from the older “Priority” method to the newer “Fixed” method. Priority uses the 25MBps reclaim rate. You may want to get some advice from your storage array vendor on the best setting for your particular array. However, it is based on feedback from many array vendors, especially the flash array vendors, that this enhancement was introduced.
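
To get a feel for what the different rates mean in practice, the following Python sketch estimates how long it would take to reclaim a given amount of dead space. It is back-of-the-envelope arithmetic only: it assumes the configured rate is sustained for the whole run, treats 1GB as 1024MB, and the 1TB of dead space is a made-up example.

```python
def reclaim_hours(dead_space_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to reclaim dead_space_gb at a sustained rate (1GB = 1024MB)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return dead_space_gb * 1024 / rate_mb_per_s / 3600


# Hypothetical datastore with 1TB (1024GB) of dead space to reclaim.
for label, rate in (("Priority, 25MBps", 25), ("Fixed, 100MBps", 100), ("Fixed, 2GBps", 2048)):
    print(f"{label}: {reclaim_hours(1024, rate):.2f} hours")
```

In reality the array decides how quickly it can service UNMAP requests, which is exactly why vendor guidance on the setting is worth having.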

VAAI XCOPY Enhancements

When offloading certain tasks to the storage array via the VAAI primitives, one of the common tasks offloaded is a clone task. ESXi asks the array to copy blocks from location A to location B instead of doing this work on the hypervisor. This request uses the XCOPY command. By default, the Maximum Transfer Size of an XCOPY ranges from 4MB to 16MB. However, EMC VMAX arrays have been able to ask the ESXi host to tune this to a higher value, up to 200MB if I remember correctly. In vSphere 6.7, through the use of PSA claim-rules, we are going to extend this functionality to additional storage arrays, and again, if I remember correctly, these are the DELL-EMC XtremIO, VNX and Unity arrays. I don’t know what the recommended values will be for these arrays, but I suspect there will be some product documentation and guidance from DELL-EMC in the not too distant future.
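
As a rough illustration of why a larger transfer size can help, the Python sketch below counts how many XCOPY commands a clone would need at different maximum transfer sizes. This is a deliberate simplification: it assumes one command per full-sized chunk of a hypothetical 40GB VMDK and ignores how the host and array actually segment and queue the work.

```python
import math


def xcopy_commands(vmdk_gb: float, transfer_mb: int) -> int:
    """Number of XCOPY commands if each one copies up to transfer_mb of data."""
    return math.ceil(vmdk_gb * 1024 / transfer_mb)


# Hypothetical 40GB VMDK cloned at three different maximum transfer sizes.
for size_mb in (4, 16, 200):
    print(f"{size_mb}MB transfer size: {xcopy_commands(40, size_mb)} commands")
```

Fewer, larger commands mean less per-command overhead, but whether that translates into faster clones depends on the array, so treat vendor guidance as the authority here.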

Summary

There are some nice core storage enhancements in this release, I’m sure you will agree. I haven’t covered everything, but there should be a wealth of updated documentation and blog posts which will delve even deeper into what we have done in the core storage space in vSphere 6.7. One last item however – this release sees the end of life (EOL) for VMFS-3. So now is a great time to move to 6.7 and VMFS-6.

 

vSAN 6.7 features

Let’s move on to what is new in vSAN 6.7. Many of the items listed above are also true for vSAN, such as full support in the HTML5 client, as well as 4KN device support. Let’s focus on some of the other features and enhancements.

Integrated with vRealize Operations

vRealize Operations 6.7 now provides a global operations view of vSAN 6.7 environments with six new dashboards embedded within vCenter Server 6.7. Probably the coolest thing about this capability is that it does not require a separate vRealize Operations license and is available to anyone with a vSAN Advanced or vSAN Enterprise license.

vSAN iSCSI support for Windows Server Failover Cluster (WSFC)

This is a feature many customers have been asking about. A number of releases back, we supported the presentation of vSAN VMDK objects as iSCSI LUNs via an iSCSI target. This could then be consumed by any host (physical or virtual) that had an iSCSI initiator. We have now extended this functionality to support these iSCSI devices being presented to a WSFC configuration. In the event of a vSAN node failure, or indeed a vSAN node being placed into maintenance mode, the I/O owner of the iSCSI LUN can be moved to an alternative node in the vSAN cluster and I/O can continue to flow between the WSFC nodes and the iSCSI device. Once again, we can support this on physical and virtual hosts that have iSCSI initiators.

QoS on Resync Traffic

Resync is an operation that takes place for a number of reasons on vSAN, be they remedial or maintenance related. For example, changing a policy from a RAID-1 to a RAID-5 will instigate the creation of a new object layout, and then all the data will need to be synced between the original object and the new object before the original object is deleted. In the past, huge amounts of resync traffic could impact the performance of the VM IO. With this release, we are building on top of existing enhancements in this area to provide additional Quality of Service (QoS) around network traffic and VM IO. Now if contention arises, we can throttle resync traffic down to 20% of the network bandwidth, allowing VM IO to consume 80% of the bandwidth. Of course, if there is no resync, VM IO can consume 100% of the bandwidth. We think that this new QoS mechanism will resolve contention issues seen in the past between VM IO and resync traffic.
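
The 80/20 behaviour described above can be modelled in a few lines of Python. This is only a sketch of the stated policy, not the vSAN scheduler: the function name, the demand figures and the 10Gbps link are all assumptions made for illustration.

```python
def split_bandwidth(link_gbps: float, vm_demand: float, resync_demand: float,
                    resync_floor: float = 0.20) -> tuple[float, float]:
    """Return (vm_share, resync_share) in Gbps for the modelled policy.

    Resync may use whatever VM IO leaves free, and under contention it is
    throttled down to no less than resync_floor of the link.
    """
    resync = min(resync_demand, max(resync_floor * link_gbps, link_gbps - vm_demand))
    vm = min(vm_demand, link_gbps - resync)
    return vm, resync


# Hypothetical 10Gbps vSAN link.
print(split_bandwidth(10, vm_demand=12, resync_demand=8))  # contention: (8.0, 2.0)
print(split_bandwidth(10, vm_demand=12, resync_demand=0))  # no resync:  (10, 0)
print(split_bandwidth(10, vm_demand=3, resync_demand=4))   # no contention: (3, 4)
```

The three cases line up with the text: resync drops to 20% when both sides want more than the link can give, VM IO gets everything when there is no resync, and nobody is throttled when the link is not saturated.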

Improvements to VM Swap

The VM Swap object on vSAN has always had some unexpected characteristics. In the first place, it did not adhere to the policy settings associated with the VM, but instead always used the default policy settings. Also, the VM Swap was always provisioned thick unless you set an advanced parameter, which we bubbled up in vSAN 6.2, to make it thin. In this release, we changed both of these behaviors. The VM Swap object now inherits the policy settings assigned to the VM Namespace, and it is now provisioned thin by default.

Witness Traffic Separation for Stretched Cluster

We have had Witness Traffic Separation (WTS) functionality for 2-node vSAN deployments for some time, but in this release we can support WTS with Stretched Clusters as well. The idea is to separate the vSAN traffic from the witness traffic, so that when the witness appliance is deployed at a remote/third site, you only need to route or stretch the witness traffic to the third site rather than all of the vSAN traffic. For example, if the vCenter Server and management components were at the third site along with the witness appliance, the witness traffic could be placed on the management network while leaving the vSAN traffic on the vSAN network between the two data sites.

Fast failover on redundant vSAN networks

This relates to having vSAN nodes configured with multiple vSAN networks. If one network fails, then the nodes could communicate on the other network. Prior to this release, there was no way of “fast-failing” vSAN network connections to initiate a quick failover. We basically had to wait for TCP time-out to occur before using the alternative vSAN network. This took minutes to happen. In vSAN 6.7, we now have a fast fail mechanism which means that failover to an alternate network can now happen in seconds. I will caveat this with the point that if the two vSAN networks are isolated, and you have individual NIC failures on one host, then that host will be isolated. If there is a route between the vSAN networks, then you are “good to go”. To avoid this situation with single NIC failures, one might consider a NIC team for both vSAN networks. Of course, if a whole network goes down (e.g. switch failure) and all hosts move to using the secondary vSAN network, then you are “good to go” as well. We continue to improve this area as we work towards full air-gap support of vSAN networks.

Support for Shared Nothing architectures e.g. Big Data (by request only)

This is a discussion that is becoming more and more common when speaking with our vSAN customers. They would like to be able to run shared nothing architectures, such as Hadoop, on vSAN. However, to meet the requirements of shared nothing architectures, there needs to be some way of keeping both the compute and storage on the same host for redundancy purposes. In many cases, the application itself has replication built-in – Hadoop HDFS being a common example. So how can vSAN provide this host pinning of data, and indeed how does it benefit these shared nothing architectures? Let’s take the Hadoop example.

We can think of the Hadoop filesystem, HDFS, as needing two different components by default – datanodes and namenodes. Datanodes are nodes/VMs that will hold the actual blocks of data, while namenodes will hold the metadata (file info, block locations) in a file called fsimage. When we deploy the datanode components, vSAN does not need to offer any availability. HDFS has its own replication factor built-in (default: 3), meaning that all blocks are replicated across the datanodes. Thus we can use an FTT=0 policy on vSAN for these VMs. Through vSphere, we can help with placement and affinity/anti-affinity with other datanodes in a vSphere environment, as we do not want 2 datanode VMs on the same ESXi host.

This brings us to the namenode, which holds the HDFS metadata/lookup. You may also have a secondary namenode, but note this is not a failover namenode. The secondary only does admin tasks on behalf of the primary namenode. These namenodes are not protected at the storage level in any way (there is no built-in replication factor, etc). So here vSAN can offer availability at the storage level with FTT/RAID levels, etc.

If you do wish to make your Hadoop namenodes highly available, in other words, not have your deployment rely on a single namenode, this introduces another set of Hadoop components. You will now have an active and standby namenode (not a secondary), but you also end up with a set of journal nodes for tracking namenode transactions. If the active namenode fails, the journal can be replayed against the standby namenode. Again, these are not protected at the storage level, so vSAN can again provide FTT/RAID protection for the journal nodes. So to recap, vSAN FTT=0 can be used with datanodes, but we can certainly use higher FTT with namenodes and journal nodes to make Hadoop on vSAN highly available from a storage perspective, and vSphere can help with VM compute placement and affinity/anti-affinity.
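
The recap above can be written down as a small policy table, along with the capacity argument for FTT=0 on datanodes. The Python sketch below is illustrative only: FTT=1 for namenodes and journal nodes is just one example of a "higher FTT", the capacity formula assumes RAID-1 mirroring (FTT+1 copies of each object), and the 10TB dataset is made up.

```python
# Example vSAN FTT value per Hadoop role, following the recap above.
# FTT=1 is an illustrative choice for the roles HDFS does not replicate itself.
FTT_BY_ROLE = {
    "datanode": 0,     # HDFS already replicates blocks across datanodes
    "namenode": 1,     # no built-in storage protection
    "journalnode": 1,  # no built-in storage protection
}


def raw_footprint_tb(dataset_tb: float, hdfs_replication: int, ftt: int) -> float:
    """Raw vSAN capacity consumed, assuming RAID-1 mirroring (ftt + 1 copies)."""
    return dataset_tb * hdfs_replication * (ftt + 1)


# Hypothetical 10TB dataset with the default HDFS replication factor of 3.
print(raw_footprint_tb(10, 3, FTT_BY_ROLE["datanode"]))  # 30
print(raw_footprint_tb(10, 3, 1))                        # 60
```

Mirroring datanode VMDKs on top of HDFS replication would double the raw footprint without adding much protection, which is the reasoning behind FTT=0 for that role.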

And how do we make sure that the compute and VMDKs for the datanodes are kept local on the same vSAN host? We do this by introducing a new policy for data locality in 6.7. Note that this is not yet freely available, but is only available via the RPQ process. If this is something you would be interested in, reach out to your local VMware account team who can put you in touch with our vSAN Product Managers.

New On-Disk Format Version 6

You will notice that there is a new on-disk format version with this release. While there are no specific features in vSAN 6.7 that rely on this on-disk format, we do recommend that you upgrade to version 6 to future-proof your vSAN environment for some forthcoming features (which I can’t yet discuss) coming down the line. Note that there is no data move involved in upgrading the on-disk format to version 6 in the 6.7 release of vSAN.

FIPS 140-2 validation for vSAN Encryption

vSAN Encryption in vSAN 6.7 now meets strict U.S. Federal government security requirements with FIPS 140-2 validation. This is a standard used to approve cryptographic modules.

Summary

A lot of nice new features in this release once again. We are improving our workflows, which we now have the ability to enhance in the new H5 client. We are also looking at how we can provide even more resilience, especially in the area of networking. And finally we are looking at new use cases, such as shared nothing architectures. Again, this is not a comprehensive list of all of the enhancements in the 6.7 release. I have highlighted only some of the items. Check out the official docs and blogs for a complete list.

 

Virtual Volume Enhancements in vSphere 6.7

VVol support for Windows Server Failover Cluster (WSFC)

There is one major enhancement to VVols in this release and that is support for WSFC, Windows Server Failover Cluster on VVols. We have always relied on Raw Device Mappings (RDMs) for WSFC, as we had the ability to place SCSI-3 Persistent Group Reservations (PGRs) on the RDM. In vSphere 6.7, we can now place SCSI-3 PGRs on a virtual volume. This means that we can also support WSFC. Great news.

VASA Provider Enhancements

We now default to TLS v1.2, the Transport Layer Security protocol, in vSphere 6.7. This will mean that the VASA Provider provided by your VVol vendor has to support TLS v1.2 as well. If it does not, there is an alternative way of ensuring you can continue to communicate with your VASA Provider from vSphere 6.7. This method involves modifying the versions of TLS that vSphere 6.7 supports and enabling it to support earlier versions. My understanding is that this will be a documented procedure, probably in the form of a KB.
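
If you want a quick way to see whether an endpoint will negotiate TLS v1.2 or later before you upgrade, something like the following Python sketch may help. It uses only the standard `ssl` and `socket` modules; the host name and port in the usage comment are placeholders, not a real VASA Provider URL. Note that the default context verifies certificates, so a provider using a self-signed certificate would need its CA added to the context first.

```python
import socket
import ssl
from typing import Optional


def tls12_only_context() -> ssl.SSLContext:
    """Client context that refuses anything older than TLS v1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_version(host: str, port: int, timeout: float = 5.0) -> Optional[str]:
    """Connect and return the negotiated protocol, e.g. 'TLSv1.2'.

    Raises ssl.SSLError if the server only offers older TLS versions.
    """
    ctx = tls12_only_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()


# Usage (placeholder host and port, substitute your own provider endpoint):
# print(negotiated_version("vasa-provider.example.com", 8443))
```

A handshake failure here is a hint that the provider needs an update, or that the documented procedure for re-enabling older TLS versions will be required.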

Also in 6.7, there is full support for connecting to the VASA Provider over IPv6.

Summary

VVol enhancements continue apace. The support for WSFC is certainly one that I know a lot of customers have been asking for. Watch out for the TLS change though – I think that may catch some people out.

19 comments
  1. Cormac,

    Regarding the disk format upgrade, I see here (and elsewhere) that the upgrade does not involve moving data. Is this to mean it will be a similar upgrade as was 3.0 to 5.0, where the disk groups were not removed?

  2. Hello Cormac, my name is Juan from Spain and I have a question about witness traffic on a stretched cluster. Is communication necessary between the vSAN ESXi hosts and the ESXi witness?
    Thanks in advance.
    Kind regards.

    • Yes – there has to be some communication between the ESXi host/vSAN node and the witness appliance.
      Up until 6.7, all of the vSAN network (which includes the witness traffic) had to be routed to the third site where the witness appliance resides.
      In 6.7, we can separate the witness traffic from the vSAN traffic.
      This means that the witness traffic can be routed to the third site where the witness appliance resides, and the vSAN traffic only needs to flow between the 2 data sites.

  3. Thanks for the awesome news Cormac! What about RDMA with RoCE usage for a high-performance vSAN all-flash design with NVMe cache drives? I’m looking for some documentation or a whitepaper for such deployments.

  4. Thanks for the quick reply Cormac, but I had already found this article. I can’t find examples, whitepapers or documentation from VMware about vSAN deployments with RDMA (RoCE). I only ever find statements such as “RDMA can be useful to accelerate many of the hypervisor services including; vMotion, SMP-FT, vSAN, NFS and iSCSI” from – https://storagehub.vmware.com/t/vsphere-storage/vsphere-6-7-core-storage-1/rdma-support-in-vsphere/ but I can’t find any practical examples. I’m currently planning to deploy vSAN with NVMe devices and RDMA (RoCE) with Mellanox for one of our customers. He comes from a badly sized MS S2D deployment and I don’t want to offer him an unproven and unsupported solution.

  5. Hi Cormac,

    Is it now possible in 6.7 to do without the cache tier in an NVMe-only vSAN cluster? It makes no sense…

    • No Michael, it is not. vSAN is built as a caching system, so there is still a need for a cache tier and capacity tier in the current architecture.

      However, we are looking at ways to re-architect vSAN so that there will no longer be a need for both cache and capacity.
