Announcing the Virtual SAN 6.0 Health Check Plugin

Today VMware announces the Virtual SAN 6.0 Health Check Plugin, a feature that will check your Virtual SAN configuration, both proactively and reactively, and highlight any abnormal conditions found in the cluster. This is available to all our VSAN customers right now. Not only does it check the health of the cluster, but it also checks the state of the network, host connectivity, physical disk status, and underlying virtual machine object state. This is a great tool for ensuring that an initial deployment of VSAN or proof-of-concept has been rolled out successfully, giving you confidence in your VSAN deployment. It…

VSAN 6.0 Part 9 – Proactive Re-balance

This is another nice new feature of Virtual SAN 6.0. It is essentially a directive to VSAN to start re-balancing components belonging to virtual machine objects around all the hosts and all the disks in the cluster. Why might you want to do this? Well, it’s very simple. As VMs are deployed on the VSAN datastore, algorithms place their components across the cluster in a balanced fashion. But what if a host was placed into maintenance mode, and you requested that the data on the host be evacuated prior to entering maintenance mode, and now…

VSAN 6.0 Part 8 – Fault Domains

One of the really nice new features of VSAN 6.0 is fault domains. Previously, there was very little control over where VSAN placed virtual machine components. In order to protect against something like a rack failure, you may have had to use a very high NumberOfFailuresToTolerate value, resulting in multiple copies of the VM data dispersed around the cluster. With VSAN 6.0, this is no longer a concern as hosts participating in the VSAN Cluster can be placed in different failure domains. This means that component placement will take place across failure domains and not just across hosts. Let’s look…
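As a rough illustration of why placement across fault domains matters, here is a minimal Python sketch. It is a simplified model, not VSAN's actual placement logic: it assumes mirrored objects, that tolerating n failures needs 2n + 1 hosts or fault domains, and that an object stays accessible while more than half of its components (each given equal weight here) remain available. The function names and rack labels are hypothetical.

```python
from collections import Counter


def min_fault_domains(failures_to_tolerate: int) -> int:
    """Minimum hosts or fault domains for a mirrored object (assumed 2n + 1)."""
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be non-negative")
    return 2 * failures_to_tolerate + 1


def survives_any_single_domain_failure(component_domains: list[str]) -> bool:
    """Check whether losing any one domain still leaves a majority of components.

    component_domains lists the domain (e.g. rack) holding each component
    of one object: replicas and witnesses alike, one entry per component.
    """
    total = len(component_domains)
    per_domain = Counter(component_domains)
    # Assumption: every component carries equal weight in the majority check.
    return all((total - lost) * 2 > total for lost in per_domain.values())


# Two replicas plus a witness, one per rack: any single rack can fail.
spread = ["rackA", "rackB", "rackC"]
# Two of the three components share a rack: losing rackA loses the majority.
stacked = ["rackA", "rackA", "rackB"]

print(min_fault_domains(1))
print(survives_any_single_domain_failure(spread))
print(survives_any_single_domain_failure(stacked))
```

In this model, the second placement is the case fault domains are meant to prevent: the hosts differ, but the components share a rack.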

VSAN 6.0 Part 7 – Blinking those blinking disk LEDs

Before I begin, this isn’t really a feature of VSAN, so to speak. In vSphere 6.0, you can also blink LEDs on disk drives without VSAN deployed. However, because of the scale-up and scale-out features in VSAN 6.0, where you can have a great many disk drives and ESXi hosts, being able to identify a drive for replacement becomes very important. So this is obviously a useful feature. And of course I wanted to test it out, see how it works, etc. In my 4-node cluster, I started to test this feature on some disks in…

VSAN 6.0 Part 6 – Maintenance Mode Changes

There is a subtle difference in maintenance mode behaviours between VSAN version 5.5 and VSAN version 6.0. In Virtual SAN version 5.5, when a host is placed into maintenance mode with the “Ensure Accessibility” option, the host in maintenance mode continues to contribute its storage towards the VSAN datastore. In other words, any VMs that had components stored on this host still remained fully compliant, with all of the components available. In VSAN 6.0, this behaviour changed. Now, when a host is placed into maintenance mode, it no longer contributes storage to the VSAN datastore, and any components that reside…

VSAN 6.0 Part 5 – new vsanSparse snapshots

There is a new snapshot format introduced in VSAN 6.0 called vsanSparse. This replaces the traditional vmfsSparse format (redo logs). The vmfsSparse format was used when snapshots of VMs were taken in VSAN 5.5, and is also the format used when a snapshot is taken of a VM residing on traditional VMFS and NFS. The older vmfsSparse format left a lot to be desired when it came to performance and scalability. This KB article from our support team, indicating that no snapshot should be used for more than 72 hours, and snapshot chains should contain no more than 2-3 snapshots,…

VSAN 6.0 Part 4 – All-Flash VSAN Capacity Tier Considerations

In Virtual SAN version 6.0, VMware introduced support for an all-flash VSAN. In other words, both the caching layer and the capacity layer could be made up of flash-based devices such as SSDs. However, the mechanism for marking some flash devices as being designated for the capacity layer, while leaving other flash devices as designated for the caching layer, is not at all intuitive at first glance. For that reason, I’ve included some steps here on how to do it.