vSphere 6.0 Storage Features Part 3: MSCS Improvements

OK, not storage improvements per se, but I got into the habit of documenting our Microsoft Clustering Services (MSCS) improvements some time back, and habits die hard. Many of our customers continue to run MSCS on top of vSphere. This is well recognized, and VMware continues to improve and add features around this for our customers. vSphere 6.0 is no different, with a selection of improved functionality around MSCS on vSphere.

1. vMotion of MSCS nodes using pt-RDMs

Yes, it’s finally here: the ability to vMotion virtual machines with pass-thru raw device mappings (RDMs), and that includes virtual machines that are being used as nodes in MSCS. (Pass-thru mode is also referred to as physical compatibility mode.) Of course, MSCS is about the only thing that uses RDMs at this stage, and we would recommend that you avoid using them if at all possible. But since MSCS still places a SCSI reservation on the actual disk as part of its tie-breaking procedure when a failure occurs, the use of RDMs is still necessary so that the SCSI reservation can go all the way down to the disk (and not be converted to a file lock, which is what happens with VMDKs and non-pass-thru RDMs). A lot of changes were made to the various storage layers in the VMkernel to checkpoint/restore SCSI-3 reservations during a vMotion. Awesome stuff!

2. MSCS and IPv6 inter-op

VMware has done a lot of work in vSphere 6.0 around IPv6. We can now deploy MSCS nodes running in VMs with IPv6.

3. PVSCSI support for MSCS

This has been a request I’ve heard a number of times. PVSCSI (ParaVirtual SCSI) adapters can provide better performance than other virtual SCSI adapters under certain conditions/workloads. PVSCSI adapters are now supported as a virtual adapter for virtual machines that are configured as MSCS nodes.

4. Guest OS and Application Support

VMware now supports the following guest OS and applications with MSCS in vSphere 6.0:

  • Windows Server 2012 R2
  • SQL Server 2012
  • SQL Server 2012 AlwaysOn Availability Groups (AAG)

5. Protect vCenter Server 6.0 with MSCS

I read this on a recent vSphere blog post. It seems that, amidst the initial confusion, vCenter Server 6.0 (and 5.5 U3) will now support the clustering of the vCenter Server using MSCS, in addition to the back-end database. It seems that there will be more information coming soon, and I’d recommend checking the vSphere blog regularly for updates.

A nice set of improvements in vSphere 6.0 around MSCS.

10 Replies to “vSphere 6.0 Storage Features Part 3: MSCS Improvements”

  1. I thought Windows 2012, SQL 2012 and SQL AAGs were already supported in vSphere 5.5, let alone 6? Did I misread something here? Specifically, on this link SQL AAG (bottom row) shows as supported:

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1037959

    That said, I still find with vSphere 5.5 that SQL AAGs and Exchange will fail over during vMotions (storage or host) and long-running snapshots. Increasing the cluster heartbeat helps, but doesn’t eliminate it entirely. I would really love to see VMware do a formal write-up on optimizing a VM guest for vMotion. The heartbeats are one thing, but is there anything else we can do?
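The heartbeat remark in the comment above comes down to simple arithmetic, sketched here in Python. This is only an illustration, not VMware or Microsoft guidance. It assumes the cluster declares a node down after `threshold` consecutive missed heartbeats sent every `delay_ms` milliseconds (the Windows failover cluster properties SameSubnetDelay and SameSubnetThreshold), and it assumes defaults of 1000 ms and 5 for Windows Server 2012 R2. The function names and the 8-second stun figure are hypothetical.

```python
def failure_detection_window_ms(delay_ms: int, threshold: int) -> int:
    """Time without heartbeats before a node is declared down.

    delay_ms  -- interval between heartbeats (assumed: SameSubnetDelay)
    threshold -- missed heartbeats tolerated (assumed: SameSubnetThreshold)
    """
    return delay_ms * threshold


def survives_stun(stun_ms: int, delay_ms: int, threshold: int) -> bool:
    """True if a VM pause of stun_ms fits inside the detection window."""
    return stun_ms < failure_detection_window_ms(delay_ms, threshold)


if __name__ == "__main__":
    # Assumed defaults: 1000 ms delay, threshold of 5.
    print(failure_detection_window_ms(1000, 5))
    # A hypothetical 8-second stun against default and relaxed settings.
    print(survives_stun(8000, 1000, 5))
    print(survives_stun(8000, 1000, 10))
```

Widening the window only helps when the stun is shorter than the new window, which fits the observation that raising the heartbeat settings reduces failovers without eliminating them.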

    1. EricSinger,

      Snapshots stunning VMs on release is common for high-transaction VMs (it’s a lot to flush from the redo log). We’ve worked around this a couple of ways.

      1. Set the database and log VMDKs to independent persistent and do in-guest backups (to another VMDK drive), then back up just C:\ and that drive in your backups.

      2. Quit snapping the VMs (vSphere Replication does write splits for replication; use in-guest agents for VDPA, etc.).

      3. (Soon) switch to VSAN 6 with the new snapshot system based on metadata pointers and a log-structured file system. Instead of 50-80% hits for snapshots, we are looking at 3% IO performance hits with VSAN for snapshots. Talking to some customers, I think we will see a move for BCA (SQL specifically) to VSAN 6 (hopefully some all-flash configs) to work around this existing problem.

      1. Thanks for the reply:

        1. I get that snaps stun a VM, and it’s more noticeable for high-transaction VMs. I was just wondering if there were any improvements in the native VMware snap process that make it more usable for MSCS.

        As for the recommendations, just a few comments:

        Making disks independent would not help me and many others in the case of backups. I use Veeam, which uses snaps as the only backup method. Setting the disks to independent would mean I wouldn’t get the most critical part of my data. Using application agents is fine for those lucky enough to afford the licensing schemes that come from the likes of CommVault and NetBackup.

        As for vSphere Replication, I’m not replicating the VMs (that’s the purpose of SQL AAGs; application-level replication > VM-level replication). My issue is with backup, which requires a VMware snap.

        Don’t get me started on vSAN. Let’s just say if that’s the best VMware can do, they can keep it.

        Back on subject, I honestly think VMware should remove SQL AAG and Exchange CCR/DAGs from their support matrix, because if they can’t guarantee it won’t fail over during any of those actions, then it shouldn’t have a check box next to it. I’ve increased the cluster timeouts and the disk timeouts, and it still occasionally fails over. I have a 20 vCPU / 256GB VM, and it both slows to a crawl and occasionally fails over if I try to vMotion it during the day. I’m a very big proponent of VMware and all its features, but it’s clear to me that MS’s clustering doesn’t play well with vMotion or snapshots.

    1. Is this not already available? To be honest, I don’t see many CIB scenarios so I’m not up to speed on what you can and cannot do with it.

        1. David, I’d argue that for Cluster in a Box, FT is actually more useful in vSphere 6 (it supports 4 cores and doesn’t require shared storage, as you actually store 2 copies of the VM now). Saves you a lot on Microsoft licensing. Only downside is you’ll have to take downtime to patch (but less to manage).

          1. Hi, yes, FT is definitely more useful. However, it is not good enough for some of our mission-critical systems, which can only have a few minutes of downtime per month. There will always be downtime during patching, so in this case HA must be offered at the application layer.

            I did find some info saying that you can now also vMotion VMDKs with physical bus sharing, but it’s not supported. So probably not a good idea to do this 🙂 http://blogs.vmware.com/apps/2015/02/say-hello-vmotion-compatible-shared-disks-windows-clustering-vsphere.html
