Step-by-step deployment of the VSAN witness appliance

In an earlier post, I described the witness appliance in a lot of detail. Using the witness appliance is VMware’s recommended way of creating a witness host. Ideally, customers should avoid building their own bespoke appliances for this purpose. Also note that the witness appliance is not a general-purpose ESXi VM. It doesn’t store or run nested VMs, and it has no VM or HA/DRS related functions. It simply functions as a VSAN witness and plays no other role.

In this post, I will take you through step-by-step instructions on how to deploy a witness appliance for either a VSAN stretched cluster deployment or a VSAN 2-node ROBO type deployment. Unfortunately this entails a lot of screenshots, so apologies in advance for that. However, I did want this post to cover all the angles. There are 6 steps to getting the appliance deployed and configured successfully.

1. Physical ESXi network setup

Let’s begin by taking a look at the underlying physical ESXi host that we wish to deploy the witness appliance onto. You will have to ensure that the physical ESXi host on the witness site is able to reach both the management network and the VSAN networks of the data sites. In my configuration, for simplicity’s sake, this physical ESXi host has an uplink connection to a trunk port which can access both the management network and the VSAN network. There is a portgroup created on the physical ESXi host which uses this uplink.

[Screenshot: pESXi trunk ports]

When we get around to configuring the network on the witness host, we can use this trunk port. Now it is time to start rolling out the OVA.

2. Deploy the OVA

In the interest of completeness, all of the OVA deployment steps are shown here (consider this screenshot hell part 1 if you wish). If you feel that you do not need to step through this and are comfortable with OVA deployments, please skip to part 3, the Management network configuration section. To begin the deployment, select a host or cluster and then choose “Deploy OVF Template…”. In this example, it is being deployed directly to a host:

[Screenshot: deploy ovf]

Select the OVA:

[Screenshot: select source]

Review details. Note the product name and version. The 6.1 is a reference to VSAN 6.1, which is the release of VSAN included with vSphere 6.0U1.

[Screenshot: review details]

Accept EULA:

[Screenshot: eula]

Provide a hostname and optional folder location for the appliance:

[Screenshot: name]

Next, decide the size of the appliance. There are 3 options: Tiny, Medium (default) and Large. The size should be chosen to match the number of VMs you plan to deploy:

[Screenshot: configs]

In this case, “Medium” is chosen. Information regarding the resources required by the appliance is displayed:

[Screenshot: medium config]

Choose a location on the host to store the appliance. The witness host on my witness site has lots of storage to choose from. This may not be the case in your deployment:

[Screenshot: storage]

Next, select a port group. Note that this port group is the “trunked” port group mentioned earlier and uses a physical network adapter on the underlying ESXi host that is connected to a trunk port on the switch. This means it can reach many VLANs, including both the ESXi management VLAN and the VSAN VLAN on the data sites:

[Screenshot: networks]

Provide a password for the root user:

[Screenshot: password]

Review and complete:

[Screenshot: ready]

Once the witness appliance is deployed, review the VM hardware. Compare the compute/memory/disk configuration to the tiny/medium/large option chosen during deployment and ensure you chose the correct one. Note also that the appliance has two network adapters: one for the ESXi management network and one for the VSAN network. We will configure these next.

[Screenshot: appliance hardware]

3. Management network configuration

When the OVA has been deployed, open a console to it. Since it is an ESXi in a VM, what you will see is the DCUI, hopefully familiar to you from managing physical ESXi servers. Log in to the DCUI and configure the appliance so that it is reachable on the network. Provide the root password that you specified during the OVA deployment:

[Screenshot: login]

By default, the witness appliance’s network is able to pick up a DHCP address. If there is no DHCP server, a default address is configured, as seen below. We will change this to a static IP shortly.

[Screenshot: config not set]

Note that this appliance comes with a default hostname and DNS suffix. We should also change these, and will do so shortly too.

[Screenshot: dns not set]

First of all, add a VLAN ID for the management network if one is necessary. Since my example deployment is using a trunk port, the VLAN ID needs to be added in the “VLAN (optional)” section in the DCUI. Once that is done, move to the “IPv4 configuration” and add the relevant IP address, Subnet Mask and Default Gateway entries:

[Screenshot: setup mgmt nw]

After saving the IP information, the DNS information should be updated:

[Screenshot: dns set]

You should also visit the “Custom DNS suffixes” section and remove the default “eng.vmware.com” suffix. Add the DNS suffix relevant to your environment. Once all that is completed, run the Management Network Test. Make sure that it passes successfully, including the hostname resolution.

[Screenshot: test mgmt nw]

Exit out of the DCUI.

[Screenshot: all ok]

The next steps are to add the ESXi host to vCenter server, and then complete the VSAN network setup.
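As an aside on step 3: if the Management Network Test does not pass, one thing worth ruling out is a simple mismatch between the IP address, subnet mask and default gateway that were typed into the DCUI. The following is a minimal sketch of such a check, not anything that ships with the appliance. The function name and the example addresses are hypothetical, and the 1-4094 range for a tagged VLAN ID is my assumption for a trunked setup like the one used here.

```python
import ipaddress


def check_mgmt_settings(ip, netmask, gateway, vlan=None):
    """Return a list of problems found in DCUI-style IPv4 settings (empty if none)."""
    problems = []
    iface = ipaddress.IPv4Interface(f"{ip}/{netmask}")
    network = iface.network
    gw = ipaddress.IPv4Address(gateway)

    # The default gateway must sit in the same subnet as the management address.
    if gw not in network:
        problems.append(f"gateway {gw} is not in {network}")
    # The host address must not be the network or broadcast address.
    if iface.ip in (network.network_address, network.broadcast_address):
        problems.append(f"{iface.ip} is the network or broadcast address of {network}")
    if gw == iface.ip:
        problems.append("gateway and host address are identical")
    # Assumption: a tagged VLAN ID on a trunk port falls in the range 1-4094.
    if vlan is not None and not 1 <= vlan <= 4094:
        problems.append(f"VLAN id {vlan} is outside 1-4094")
    return problems


# Hypothetical values, for illustration only.
for problem in check_mgmt_settings("10.10.0.50", "255.255.255.0", "10.10.1.1", vlan=10):
    print(problem)
```

An empty list only means the values are consistent with each other. It says nothing about whether the VLAN or addresses are the right ones for your network, so the DCUI test remains the real check.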

4. Add the witness appliance to vCenter (as an ESXi host)

Once the management network has been configured on the witness appliance through the DCUI, it can now be added to the vCenter server as an ESXi host. This is no different to adding a physical ESXi host to vCenter. The first step is to add the hostname:

[Screenshot: add host]

Provide credentials. These are the same credentials added when the OVA was deployed, and used to log in to the ESXi via the DCUI previously:

[Screenshot: pwd]

Note the summary information. The build is the vSphere 6.0U1 build which contains VSAN 6.1:

[Screenshot: summary]

Note that the witness appliance comes with a license (License 1), so there is no need to add another license or consume your existing licenses:

[Screenshot: lic]

Lockdown mode decision is made next:

[Screenshot: lock]

Location information:

[Screenshot: location]

Ready to complete:

[Screenshot: ready]

And now the witness host is added to the vCenter inventory. Note that the witness appliance ESXi is shaded blue rather than grey to differentiate it from other ESXi hosts in the vCenter inventory:

[Screenshot: no datastore warning]

Note the warning about “No datastores have been configured”. Unfortunately, there is no supported way of disabling this warning at the present time. Again, this is something we will resolve going forward.

5. Setup the VSAN network

The fifth step is to configure the VSAN network. Two pre-defined vSwitches, portgroups and VMkernel ports are pre-packaged with the appliance. Remember from my previous blog post on the witness appliance that the second network interface has been configured such that the vmnic MAC address inside of the nested ESXi host matches the network adapter MAC address of the witness appliance (the outer MAC matches the inner MAC, so to speak). This means that we do not need to use promiscuous mode.

There is no work necessary on vSwitch0/vmk0, the management network, as this was configured via the DCUI in step 3 earlier. The focus of this step is to enable VSAN traffic on the VMkernel interface vmk1, which is on the witnessPg portgroup on the witnessSwitch vSwitch.

[Screenshot: review witnesspg network]

Remember that the VSAN network on the witness site will need to communicate with the VSAN network(s) on the data sites. If necessary, edit the witnessSwitch or the witnessPg portgroup and add a VLAN for the VSAN network. Since my environment is connected to a trunk port, I would need to add the VLAN of the VSAN network here.

[Screenshot: add vlan to vsan nw (maybe)]

To complete the network setup, you simply tag the second interface, vmk1, for VSAN traffic and assign it some IPv4 network settings. Tag for VSAN Traffic:

[Screenshot: enable VSAN traffic]

Add static IPv4 address information if required:

[Screenshot: give it ip address]

And that is that. The appliance is now configured.
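The screenshots above use the Web Client, but the same three settings (VLAN on the portgroup, IPv4 settings on vmk1, VSAN tagging on vmk1) can also be expressed as esxcli commands run on the witness appliance. The sketch below only builds the command strings; it does not run anything. The helper name and the example address are hypothetical, the portgroup name witnessPg is taken from this post, and the esxcli options are written as I understand them for this release, so check them against `esxcli <namespace> --help` on your own build before using them.

```python
import ipaddress


def witness_vsan_commands(ip, netmask, vlan=None, vmk="vmk1", portgroup="witnessPg"):
    """Build esxcli command strings for the witness VSAN network (nothing is executed)."""
    # Parsing validates the address and mask before any command is generated.
    iface = ipaddress.IPv4Interface(f"{ip}/{netmask}")
    commands = []
    if vlan is not None:
        # Only needed when the uplink is a trunk port, as in my environment.
        commands.append(
            "esxcli network vswitch standard portgroup set "
            f"--portgroup-name={portgroup} --vlan-id={vlan}"
        )
    # Static IPv4 settings for the VSAN VMkernel interface.
    commands.append(
        "esxcli network ip interface ipv4 set "
        f"--interface-name={vmk} --ipv4={iface.ip} --netmask={iface.netmask} --type=static"
    )
    # Tag the interface for VSAN traffic.
    commands.append(f"esxcli vsan network ipv4 add --interface-name={vmk}")
    return commands


# Hypothetical address and VLAN, for illustration only.
for command in witness_vsan_commands("172.16.30.5", "255.255.255.0", vlan=30):
    print(command)
```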

6. Add static routes as required

There is one final step, and that is ensuring that the VSAN network on the witness appliance can reach the VSAN network(s) of the data sites (and vice-versa). In ESXi, there is only one default gateway, typically associated with the management network. The storage networks, including the VSAN traffic networks, are normally isolated from the management network, meaning that there is no route to the VSAN network via the default gateway of the management network. Since the recommendation is to route (L3 network) between the data sites and the witness site, VSAN traffic from the data sites that is destined for the witness will be sent out via the default gateway, and so will never reach the VSAN network on the witness. So how do you overcome this?

The solution, in the current version of VSAN, is to use static routes. Add a static route on each of the ESXi hosts on each data site to reach the VSAN network on the witness site. Similarly, add static routes on the witness host so that it can reach the VSAN network(s) on the data sites. Now when the ESXi hosts on the data sites need to reach the VSAN network on the witness host, they will not use the default gateway, but whatever path has been configured on the static route. The same will be true when the witness needs to reach the data sites.
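To make the effect of the static route concrete, here is a small sketch of how a routing table picks a next hop using longest-prefix match. It is a simplified model for illustration, not the ESXi implementation, and every address in it is hypothetical: 10.10.0.1 stands in for the management default gateway, 172.16.30.0/24 for the witness VSAN network, and 172.16.10.1 for a router on the data site's VSAN network.

```python
import ipaddress


def next_hop(destination, routes):
    """Return the gateway of the most specific route that contains the destination."""
    dest = ipaddress.IPv4Address(destination)
    matches = [
        (ipaddress.IPv4Network(network), gateway)
        for network, gateway in routes
        if dest in ipaddress.IPv4Network(network)
    ]
    if not matches:
        return None
    # Longest prefix wins: a /24 static route beats the /0 default route.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical routing tables on a data-site host.
default_only = [("0.0.0.0/0", "10.10.0.1")]
with_static = default_only + [("172.16.30.0/24", "172.16.10.1")]

witness_vsan_ip = "172.16.30.5"
print(next_hop(witness_vsan_ip, default_only))  # 10.10.0.1
print(next_hop(witness_vsan_ip, with_static))   # 172.16.10.1
```

With only the default route, traffic for the witness VSAN address heads for the management gateway. Once the more specific static route exists, it takes precedence for that network and everything else still follows the default.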

While static routes are not the most elegant solution, they are the supported way of implementing the witness at the moment. Plans are underway to make this simpler.
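Until then, the routes have to be added host by host, in both directions. The following is a rough sketch that prints the `esxcli network ip route ipv4 add` command each host would need. It only generates text and runs nothing. The host names, networks and gateways are all hypothetical, as is the dictionary layout used to describe the sites.

```python
import ipaddress


def route_add_command(remote_network, gateway):
    """Build one static-route command; parsing validates the network and gateway."""
    network = ipaddress.IPv4Network(remote_network)
    gw = ipaddress.IPv4Address(gateway)
    return f"esxcli network ip route ipv4 add --network={network} --gateway={gw}"


def static_route_plan(data_sites, witness):
    """Map each host name to the list of route commands it needs."""
    plan = {}
    for site in data_sites:
        # Every data-site host needs a route to the witness VSAN network.
        command = route_add_command(witness["vsan_network"], site["vsan_gateway"])
        for host in site["hosts"]:
            plan[host] = [command]
    # The witness needs one route per data-site VSAN network.
    plan[witness["host"]] = [
        route_add_command(site["vsan_network"], witness["vsan_gateway"])
        for site in data_sites
    ]
    return plan


# Hypothetical environment, for illustration only.
data_sites = [
    {"hosts": ["esxi-a1", "esxi-a2"], "vsan_network": "172.16.10.0/24", "vsan_gateway": "172.16.10.1"},
    {"hosts": ["esxi-b1", "esxi-b2"], "vsan_network": "172.16.20.0/24", "vsan_gateway": "172.16.20.1"},
]
witness = {"host": "witness-01", "vsan_network": "172.16.30.0/24", "vsan_gateway": "172.16.30.1"}

for host, commands in static_route_plan(data_sites, witness).items():
    for command in commands:
        print(f"{host}: {command}")
```

After adding the routes, `esxcli network ip route ipv4 list` shows what is configured on a host, and a `vmkping -I vmk1 <remote VSAN IP>` from the witness (using the relevant VSAN VMkernel interface on the data hosts) is a quick way to confirm that the path works in each direction.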

Conclusion

At this point you should have a fully functional witness host. You can now go ahead and create your VSAN stretched cluster or VSAN ROBO solution. Please use the health check plugin to verify that all the health checks pass when the configuration is complete. It can also help you to locate a mis-configuration if the cluster does not form correctly. We will shortly be releasing a VSAN Stretched Cluster Guide, which will describe the network topologies and configuration in more detail. It contains deployment instructions for the VSAN witness appliance, including all the configuration steps. If considering a VSAN stretched cluster deployment (or indeed a ROBO deployment), you should most definitely review this document before starting the deployment.

16 comments
  1. Never apologize for too many screenshots – they help!
    One thing that I can’t find explicitly stated, but I’m assuming based on you choosing VNX5500 for storage, and statement about “SSD is a VMDK tagged as an SSD”. Are the storage requirements for the witness the same stringent “passthrough” as the data nodes, or can we use raid protected storage for the witness data disks?

    • Nope – there are no requirements on the underlying storage on which the witness appliance is deployed. The witness appliance itself just creates some VMDKs on this storage. One of the VMDKs is simply tagged as an SSD, so that VSAN can consume it.

  2. Having a hell of a time deploying the appliance. Plug in a password that seems to meet all the criteria, but keep getting an authentication failure after the appliance boots up. Any thoughts?

    • That’s strange. It should just work. Anyway, in case it hasn’t, the default should be ‘witness’. Of course, you should change this once logged in. Let me know if it is still not working.

      • Thanks, still no luck. I’ve tried several different password combinations as well as ‘witness’, per your suggestion, but I still can’t get logged in after boot.

          • Yup. Deployed the OVA and specified a password when prompted. After that, I boot the VM up and try the password that I specified, as well as the one you suggested, and keep getting an authentication failure. The only thing that I’m doing is typing the password in through the console of the old C# client and not through the web based console of the VM. I find it hard to believe that would be the issue, but can try that as well to eliminate that as a possibility.

          • Nah – I’d be very surprised if that is it.

            Most likely it is something to do with password complexity. The OVA is allowing a “simplified” password pass, but then the ESXi host (nested) is ‘not’ allowing it to be used. All I can suggest is use a more complicated password in the deployment, but you should also open a case with support and report it.

  3. Thanks Cormac,
    Customers are asking whether, using the VSAN witness appliance in a standard configuration (not ROBO or stretched cluster), it is possible to have a 2-node VSAN cluster.
    I mean having 2 ESXi hosts licensed for vSphere and VSAN, with the third node being just a witness appliance.

    • Absolutely. 2-node is supported in standard, advanced or the special 25VM VSAN for ROBO pack. The distinction is that the two data nodes are on the one site, and not stretched across two distinct sites. In that case, you are good to go.

      We know that we haven’t communicated this very well since launch. Plans are afoot as we speak to make this as clear as possible in our collateral.

        • Let’s see if I can explain.

          With a VSAN advanced license edition, you can deploy up to a 30-node stretched VSAN with witness. 15 nodes in site 1, 15 nodes in site 2 and witness in site 3. You can deploy as many VMs as you want. But if you like, you can also deploy a 2-node stretched cluster with this license edition, 1 node in site 1, 1 node in site 2 and a witness. Not sure why you would, but you can.

          With a VSAN ROBO license, you can deploy a 2-node VSAN with the witness. You cannot deploy any larger node configurations. But you can only deploy up to 25 VMs on these two nodes.

          With a VSAN standard license edition, you can also deploy a 2-node VSAN with the witness. You cannot deploy any larger node configurations, 2 is the max. But you can deploy as many VMs on these two nodes as you wish.

          So both STD and ROBO license editions allow you to deploy a 2-node cluster with witness.

          Hope that makes sense.

  4. Is it not recommended that one vCenter Server manages the stretched cluster, the witness appliance (VM) and the witness host (ESXi)?
