Today VMware announces the Virtual SAN 6.0 Health Check Plugin, a feature that checks your Virtual SAN configuration, both proactively and reactively, and highlights any abnormal conditions found in the cluster. It is available to all our VSAN customers right now. Not only does it check the health of the cluster, but it also checks the state of the network, host connectivity, physical disk status, and underlying virtual machine object state. This is a great tool for ensuring that an initial deployment of VSAN or a proof-of-concept has been rolled out successfully, giving you confidence in your VSAN deployment. It is also useful for ongoing monitoring and maintenance of your Virtual SAN cluster.
Simple Installation without any downtime
The feature has two components: a plugin for vCenter and a VIB for the ESXi hosts. It works on both the Windows and appliance versions of vCenter Server; VMware provides an MSI for the former and an RPM for the latter. Once the plugin is installed, there is a new health view for Virtual SAN in the web client. Initially the health service is disabled, but it can be enabled with the click of a button in the UI. Once the health service is enabled via the web client, a VIB is pushed out to all of the ESXi hosts in the cluster using a rolling-upgrade mechanism. This means that the installation can be done without any downtime.
Installing the VIB does involve each host entering maintenance mode and rebooting. Once all hosts have the VIB installed, the health checks can be run against the cluster. The installation can also be done via RVC, the Ruby vSphere Console. A new health section is now available under the Monitor tab > Virtual SAN view, listing the 30 or so individual health checks.
One of the most difficult issues for VSAN customers is ensuring that their hardware configuration is supported. Until now, this has meant tedious time spent checking the VMware Compatibility Guide. With the new VSAN health check plugin, the underlying hardware of the VSAN hosts, the storage controllers, and even the driver versions can be checked automatically against the HCL. If the vCenter server has access to the Internet, an HCL database file can be retrieved automatically from VMware. Alternatively, an HCL database file can be downloaded from VMware.com, transferred to the vCenter server, and uploaded manually. At the click of a button, this gives peace of mind that the underlying configuration is indeed supported for VSAN.
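To make the idea of an automated HCL check more concrete, here is a minimal Python sketch of the comparison involved. It is not the plugin's implementation: the `Controller` type, the `HCL_DB` dictionary, the `check_against_hcl` function, and all model and driver names are hypothetical stand-ins, and the real HCL database file has its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    """One storage controller as reported by a host (hypothetical shape)."""
    host: str
    model: str
    driver: str
    driver_version: str


# Hypothetical stand-in for an HCL database file:
# controller model -> set of supported (driver, version) pairs.
HCL_DB = {
    "ExampleRAID 9000": {("example_sas", "1.2.0"), ("example_sas", "1.3.1")},
}


def check_against_hcl(controllers, hcl_db):
    """Return (host, reason) pairs for controllers that fail the lookup."""
    findings = []
    for c in controllers:
        supported = hcl_db.get(c.model)
        if supported is None:
            findings.append((c.host, f"controller '{c.model}' is not listed"))
        elif (c.driver, c.driver_version) not in supported:
            findings.append(
                (c.host,
                 f"driver {c.driver} {c.driver_version} is not listed "
                 f"for '{c.model}'")
            )
    return findings


# Made-up inventory: one supported host, one bad driver, one unknown controller.
inventory = [
    Controller("esxi-01", "ExampleRAID 9000", "example_sas", "1.3.1"),
    Controller("esxi-02", "ExampleRAID 9000", "example_sas", "0.9.0"),
    Controller("esxi-03", "Unknown HBA", "other", "2.0"),
]

hcl_findings = check_against_hcl(inventory, HCL_DB)
for host, reason in hcl_findings:
    print(f"{host}: {reason}")
```

In this sketch a host passes only if both its controller model and its exact driver version appear in the database, which mirrors the controller and driver checks described above.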
Easy troubleshooting via Ask VMware
Every individual health check is mapped to a KB article via “Ask VMware”. If any of the tests or checks reports a warning or error, the admin can simply click the “Ask VMware” button associated with it and be taken straight to a VMware knowledge base article describing the test, why it might have failed, and what can be done to remedy the situation. Here is an example of a check which failed because not all of the advanced settings used by VSAN were in sync across all the hosts in the VSAN cluster. Details about the nature of the failure are provided, as well as the mismatched values and the hosts on which they were found. Clicking the “Ask VMware” button takes you to a KB article with more information about the failed health check.
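The advanced-settings check in that example boils down to comparing the same settings across every host and reporting the outliers. The Python sketch below illustrates that idea under stated assumptions: the function name `find_setting_mismatches`, the setting names, and the "majority value wins" rule are all hypothetical, and the plugin may decide what counts as a mismatch differently.

```python
from collections import Counter


def find_setting_mismatches(settings_by_host):
    """Return {setting: {host: value}} for hosts that differ from the majority.

    Assumption: the value held by most hosts is treated as the expected one.
    """
    all_keys = set()
    for settings in settings_by_host.values():
        all_keys.update(settings)

    mismatches = {}
    for key in sorted(all_keys):
        # A host that lacks the setting entirely is recorded as None.
        values = {host: s.get(key) for host, s in settings_by_host.items()}
        counts = Counter(values.values())
        if len(counts) > 1:
            majority_value, _ = counts.most_common(1)[0]
            mismatches[key] = {
                host: value
                for host, value in values.items()
                if value != majority_value
            }
    return mismatches


# Hypothetical setting names and values, for illustration only.
hosts = {
    "esxi-01": {"Example.RepairDelay": 60, "Example.CacheLimit": 1024},
    "esxi-02": {"Example.RepairDelay": 60, "Example.CacheLimit": 1024},
    "esxi-03": {"Example.RepairDelay": 90, "Example.CacheLimit": 1024},
}

mismatches = find_setting_mismatches(hosts)
for setting, outliers in mismatches.items():
    for host, value in outliers.items():
        print(f"{setting}: {host} has {value}")
```

As in the health check described above, the output names both the mismatched value and the host it was found on, which is the information needed to bring that host back in line.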
Not only does the health check plugin highlight issues reactively (after they have occurred), but it also contains a set of proactive tests. These should not be run in production, but are a way of checking the integrity of the cluster during pre-production. These tests can check that virtual machines deploy successfully on the VSAN datastore. They can also check the bandwidth of the multicast network between the various hosts in the cluster, verifying that it is acceptable for VSAN traffic. Probably most useful of all is the storage performance test, which offers admins a range of different workloads to choose from. The duration of a particular test can also be chosen before it is run, offering a burn-in test of sorts for VSAN’s hardware components. Here is an example of one such test, and its results.
If you still need to engage our support staff for an issue, the VSAN health check plugin now has an integrated support assistant. Once you have opened your Service Request (SR), logs can be uploaded through the Support Assistant on the main VSAN UI page and associated directly with your SR. A real time-saver.
Optional Customer Experience Improvement Program (CEIP)
The health check plugin also allows you to participate in VMware’s “Customer Experience Improvement Program”, which provides VMware with information about how you use your environment. This is completely optional, and can be enabled or disabled at any time via the main VSAN UI page. The UI provides details on where further information about CEIP can be found.
Sounds great. Where do I begin?
Start by downloading the VSAN 6.0 Health Services Plugin guide. It has everything you need to know about where to get the software, how to install it, and what each of the individual checks carried out by the health plugin does. It also covers the other features not mentioned in this blog post. What are you waiting for? Go get it now!