Using Host Groups with Availability Zones (AZs) in Enterprise PKS

After being asked earlier this week how vSphere Host Groups work with Availability Zones in Enterprise PKS, I decided to spend a little time setting it up in my lab and doing some testing to make sure I understood the feature and its behaviour. Essentially, this feature allows you to use the vSphere Host Group feature to group a set of ESXi hosts together. Then, as you build Availability Zones (commonly referred to as AZs) in Enterprise PKS, a Host Group can be associated with an AZ. Anything that Enterprise PKS deploys to that AZ is then automatically placed on the ESXi hosts in that Host Group.
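
To make the vSphere side of this concrete, here is a minimal pyvmomi sketch of creating such a Host Group on a cluster. All of the names below (vCenter address, credentials, cluster, hosts, group name) are placeholders rather than values from my lab, so treat it as an illustration of the API rather than a copy-and-paste recipe.

```python
# Minimal sketch: create a vSphere DRS Host Group using pyvmomi.
# All names below are placeholders -- substitute your own environment details.
from pyVim.connect import SmartConnectNoSSL
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcsa.lab.local",
                       user="administrator@vsphere.local", pwd="password")
content = si.RetrieveContent()

# Locate the compute cluster and the ESXi hosts that should back the AZ
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster-A")
az_hosts = [h for h in cluster.host
            if h.name in ("esxi-1.lab.local", "esxi-2.lab.local", "esxi-3.lab.local")]

# Add a Host Group called PKS-AZ-1 to the cluster configuration
group_spec = vim.cluster.GroupSpec(
    info=vim.cluster.HostGroup(name="PKS-AZ-1", host=az_hosts),
    operation="add")
cluster.ReconfigureComputeResource_Task(
    spec=vim.cluster.ConfigSpecEx(groupSpec=[group_spec]), modify=True)
```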

In my testing, I first created a single Host Group with a selection of ESXi hosts added to it. Then, when I built the AZ in PKS, I associated the Host Group with it. This meant that my Enterprise PKS deployment placed the BOSH Director VM, the PKS VM, and my Kubernetes cluster master and worker nodes on the same set of ESXi hosts defined in the Host Group. This is what the Host Group (and subsequent VM Group) looked like after my first K8s cluster was deployed (the VM Group is created automatically):
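
If you want to see that automatically created VM Group from the API rather than the vSphere client, a short pyvmomi loop over the cluster's group configuration will list its members. Again, this is only a sketch using the same placeholder connection and cluster names as above.

```python
# Sketch: list the DRS VM Groups on the cluster (including the one BOSH
# creates automatically) and the VMs pinned into each of them.
from pyVim.connect import SmartConnectNoSSL
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcsa.lab.local",
                       user="administrator@vsphere.local", pwd="password")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster-A")

for group in cluster.configurationEx.group:
    if isinstance(group, vim.cluster.VmGroup):
        print(f"VM Group: {group.name}")
        for vm in group.vm or []:
            print(f"  {vm.name}")
```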

OK – so that was pretty straightforward. All PKS objects were placed in the same Host Group. Let’s now see if we can use Host Groups to do something a little more complex.

I’m going to go ahead and create a second AZ, and see if I can place the BOSH Director and PKS VMs into a sort of ‘management’ AZ, and then place the Kubernetes clusters in a different ‘production’ AZ. To do this, I have to be able to modify the AZ configuration in Pivotal Ops Manager, which is not possible by default. However, there is a way of enabling an ‘Advanced Mode’ in Ops Manager which allows you to modify the existing AZ configuration. The instructions to enable advanced mode can be found here. Please use it with caution, as per the documentation.

Now, after enabling advanced mode, I can create a second AZ in the BOSH tile in Pivotal Ops Manager and have my two AZs refer to two different Host Groups. Host Group PKS-AZ-1 contains 3 ESXi hosts, while Host Group PKS-AZ-2 contains a single ESXi host:
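
Under the covers, Ops Manager renders this into the BOSH Director's cloud config, where each AZ references its Host Group through the vSphere CPI's host_group cloud property. If you already have the BOSH CLI targeted and logged in to the Enterprise PKS BOSH Director, a quick way to eyeball the generated mapping is to dump the cloud config. The environment alias and the names in the comment below are illustrative, not taken from my lab.

```python
# Sketch: dump the BOSH Director's cloud config to see the AZ -> Host Group
# mapping that Ops Manager generated. Assumes a BOSH CLI already targeted and
# logged in; the "pks-director" alias is a placeholder.
import subprocess

cfg = subprocess.run(["bosh", "-e", "pks-director", "cloud-config"],
                     capture_output=True, text=True, check=True).stdout
print(cfg)
# Expect something along these lines for each AZ (structure per the vSphere CPI docs):
#   azs:
#   - name: CH-AZ-1
#     cloud_properties:
#       datacenters:
#       - clusters:
#         - Cluster-A:
#             host_group: PKS-AZ-1
```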

The next step is to make sure that the new AZ has access to a network. I simply allowed the new AZ to use the existing network already used by the original AZ.

The final configuration step in the BOSH tile is to choose the Availability Zone for the BOSH Director. As mentioned, the aim of this test is to place the BOSH Director and PKS VMs in the AZ backed by the single-host Host Group, and to place the K8s clusters in the AZ backed by the 3-host Host Group. Thus I modified the Singleton Availability Zone for the BOSH Director to use the new AZ:

Next, save all the changes on the BOSH Director for vSphere tile, and let’s turn our attention to the Enterprise PKS tile. The first step in the Enterprise PKS tile is to choose the AZ for the PKS jobs, just like we did for the BOSH Director previously. I will use my second AZ, CH-AZ-2, once again.

The final step is to select an AZ in each of the plans that you wish to use for the Kubernetes clusters. Enterprise PKS allows you to choose different AZs for the master and worker nodes. It also allows you to select multiple AZs if you want the nodes spread evenly across Availability Zones. I kept this quite simple and selected the same AZ for both masters and workers in my plan. This is what my plan looked like from an AZ perspective for the master and workers:

Now all that remains is for me to apply any pending changes in Pivotal Ops Manager. In fact, when I applied the changes, both the BOSH and Enterprise PKS VMs were automatically relocated to the new AZ. When the changes were fully applied, this is how the VM Group configurations looked. The first one is the K8s cluster, with a single master and 4 worker nodes:

This is the Availability Zone/VM Group which contains the BOSH Director and PKS:

Very good – now we can see that different parts of the deployment can be placed in different AZs. Make sure you disable the Advanced Mode once you have completed the configuration.
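
For anyone who prefers to verify this placement programmatically rather than from the vSphere client, here is a rough pyvmomi cross-check. The VM Group to Host Group pairing below is an assumption based on my lab's naming (check the actual group names BOSH created in your cluster), and the connection details are placeholders.

```python
# Sketch: for each (VM Group, Host Group) pair, flag any VM that is currently
# running on a host outside the Host Group backing its AZ.
from pyVim.connect import SmartConnectNoSSL
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcsa.lab.local",
                       user="administrator@vsphere.local", pwd="password")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster-A")

groups = {g.name: g for g in cluster.configurationEx.group}
# Hypothetical VM Group names -- substitute the names BOSH actually created
pairs = {"PKS-AZ-1-VM-Group": "PKS-AZ-1",
         "PKS-AZ-2-VM-Group": "PKS-AZ-2"}

for vm_group_name, host_group_name in pairs.items():
    allowed = {h.name for h in groups[host_group_name].host or []}
    for vm in groups[vm_group_name].vm or []:
        placement = vm.runtime.host.name
        status = "OK" if placement in allowed else "MISPLACED"
        print(f"{vm.name} on {placement}: {status}")
```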

Availability Zones/Host Groups/vSAN Fault Domains

Pivotal have provided some additional documentation around the use of Host Groups with vSAN Fault Domains here. In a nutshell, you can map a vSAN Fault Domain to a Host Group. This means that if you wanted to achieve something like Rack Awareness for your Kubernetes clusters, you could place the hosts in each rack into their own Fault Domain and matching Host Group. If each of these Host Groups is then associated with its own AZ, as shown above, and you distribute the K8s masters and workers across all of the AZs, you end up with a Kubernetes cluster that can tolerate a complete rack failure. This is pretty cool.
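
As a rough illustration of the rack-awareness idea, the sketch below creates one Host Group per rack from a simple rack-to-hosts mapping; each group would then be tied to its own AZ in Ops Manager, while the matching vSAN Fault Domains are configured separately in the vSAN settings. Every name here is a placeholder.

```python
# Sketch: create one DRS Host Group per rack so each rack can back its own AZ.
# (vSAN Fault Domain assignment for the same racks is done in the vSAN config.)
from pyVim.connect import SmartConnectNoSSL
from pyVmomi import vim

racks = {  # placeholder rack -> host mapping
    "Rack-1": ["esxi-1.lab.local", "esxi-2.lab.local"],
    "Rack-2": ["esxi-3.lab.local", "esxi-4.lab.local"],
    "Rack-3": ["esxi-5.lab.local", "esxi-6.lab.local"],
}

si = SmartConnectNoSSL(host="vcsa.lab.local",
                       user="administrator@vsphere.local", pwd="password")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster-A")
host_by_name = {h.name: h for h in cluster.host}

group_specs = [vim.cluster.GroupSpec(
                   info=vim.cluster.HostGroup(name=f"HostGroup-{rack}",
                                              host=[host_by_name[n] for n in names]),
                   operation="add")
               for rack, names in racks.items()]
cluster.ReconfigureComputeResource_Task(
    spec=vim.cluster.ConfigSpecEx(groupSpec=group_specs), modify=True)
```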

One could also extend this to vSAN stretched clusters, where each site is a Fault Domain in its own right, providing a maximum of two Fault Domains to work with. As per the documentation, Kubernetes nodes could be placed on either one site or both sites, depending on the requirement. [Update] I’ve since been informed that using Host Groups with vSAN Stretched Clusters is not recommended. I suspect the main reason is that the K8s control plane should have an odd number of masters greater than 1 (i.e. at least 3), while a vSAN Stretched Cluster only offers 2 Fault Domains. Thus one site would always host 2 of the 3 masters, and if that site failed, the control plane would lose quorum and bring down the complete K8s cluster. The documentation for PKS v1.5 and v1.6 has been updated to remove the statements related to vSAN stretched cluster; however, they still appear in the PKS v1.4 docs at the time of writing. I’ve been informed that a request has gone in to remove them from the PKS v1.4 docs as well.
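
To make that quorum argument concrete, here is a tiny sketch that checks whether a given spread of master nodes across fault domains can survive the loss of any single domain.

```python
# Sketch: can the control plane keep quorum if any one fault domain fails?
def survives_any_single_domain_failure(masters_per_domain):
    total = sum(masters_per_domain)
    quorum = total // 2 + 1          # majority of masters needed for quorum
    return all(total - m >= quorum for m in masters_per_domain)

# 3 masters over 2 stretched-cluster sites: one site holds 2 masters -> no
print(survives_any_single_domain_failure([2, 1]))     # False
# 3 masters over 3 racks/AZs: any single rack failure leaves 2 of 3 -> yes
print(survives_any_single_domain_failure([1, 1, 1]))  # True
```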

 

3 Replies to “Using Host Groups with Availability Zones (AZs) in Enterprise PKS”

    1. I suspect so – since there is a requirement to have an odd number of Kubernetes master nodes in the control plane, and since a stretched cluster only offers 2 availability zones/sites, a single site failure could bring your K8s cluster down if it is the site which hosts the majority of the master nodes.

      I also heard our Heptio folks (now part of VMware) say that stretched cluster is not the best architecture for K8s. My suspicion is that this is the reason.

      1. The solution to that kind of problem would probably be to use a witness in a third zone instead of three nodes (e.g. the approach used in vSAN stretched cluster or vMSC). I can imagine people are working on it.
