With the release of vSphere 7.0U1, vSphere with Kubernetes has been decoupled from VMware Cloud Foundation (VCF). VMware now has two vSphere with Kubernetes offerings: the original VCF-based offering, now referred to as VCF with Tanzu, and a newer offering outside of VCF, referred to as vSphere with Tanzu. This write-up steps through the deployment of the new vSphere with Tanzu with HA-Proxy. I won’t cover everything in this single post, but will do a series of 4 posts stepping through the process.
Differences: VCF with Tanzu and vSphere with Tanzu
I thought it useful to begin with a comparison between the two versions. When we promoted the original version of vSphere with Kubernetes on VCF, we used the following slide.
Using this slide, VMware talked about how different services in VCF with Tanzu provide containers natively in vSphere in the form of PodVMs, how NSX-T is used to provide load balancing for both the Supervisor cluster and the TKG clusters, how the vSphere CSI driver and CNS were integrated to provide persistent storage to both the Supervisor cluster and TKG clusters, and also how customers could deploy the Harbor Image Registry on the Supervisor cluster with a single click.
We also spent a lot of time explaining that vSphere with Kubernetes offered developers the vSphere with Kubernetes experience with PodVMs and the Supervisor cluster, but if developers wanted or needed a more native, upstream Kubernetes experience, then this was also available by deploying upstream, conformant, VMware engineered and supported Tanzu Kubernetes clusters via a simple manifest file.
I modified the above slide to create the following slide, after seeing something similar during the vSphere 7.0U1 launch. This slide highlights some of the differences in our new vSphere with Tanzu offering.
There are quite a number of differences as you can see. I’ve noted them in the following table.
|VCF with Tanzu
|vSphere with Tanzu
The major take-aways are that you no longer need VCF to deploy vSphere with Tanzu, nor do you need NSX-T to provide load balancers for Kubernetes. Note that both VCF and NSX-T are still supported for use with vSphere with Kubernetes. What is new is that we now use an HA-Proxy to provide load balancers (necessary for the Supervisor control plane API server and TKG cluster API servers). However, without NSX-T, you cannot use the PodVM construct in this release. Since Harbor utilized PodVMs, you no longer have the ability to deploy the embedded registry service. The other major difference is the switch from Calico to Antrea as the Container Network Interface (CNI) for the TKG clusters.
Note that if you do have NSX-T (v3.x) available, you can continue to use that network stack with vSphere with Tanzu, and in that case you will still have the PodVM construct available.
Requirements for HA-Proxy and vSphere with Tanzu
This section lists the full set of prerequisites for deploying HA-Proxy and vSphere with Tanzu. Note that there are 2 options when deploying the HA-Proxy: it can be deployed with 2 NICs or 3 NICs. In the 2 NIC configuration, there is a Management network and a Workload network. The Management network communicates with the Supervisor control plane, whilst the Workload network is used both for communicating with the TKG node IP addresses and for providing Load Balancer IP addresses. In the 3 NIC configuration, a third network is configured, called the Frontend network. This moves the Load Balancer IP address range from the Workload network to the Frontend network. In the next post, we will discuss this requirement in more detail when we deploy the HA-Proxy, but from a requirements perspective, I am assuming a separate Frontend network is being used for the Load Balancer IP addresses.
Note that the TKG control plane nodes (which will be provisioned on the Workload network) will need to be able to communicate with the Load Balancer IP addresses (provisioned on the Frontend network). You can use different subnets/VLANs, but you will need to ensure that there is a route between them.
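As a rough illustration of the 3 NIC layout described above, the following Python sketch sanity-checks an address plan before anything is deployed. All of the subnets and addresses are hypothetical examples, and `check_plan` is a helper name invented for this sketch. The check that the HA-Proxy address sits outside the Load Balancer range is my own assumption about a sensible plan, not something stated above. Routing between the Workload and Frontend networks still has to be verified on the network itself; this only catches planning mistakes.

```python
import ipaddress

# Hypothetical example values; substitute your own networks.
workload_net = ipaddress.ip_network("192.168.50.0/24")
frontend_net = ipaddress.ip_network("192.168.60.0/24")
lb_range = ipaddress.ip_network("192.168.60.64/27")        # Load Balancer IPs
haproxy_frontend_ip = ipaddress.ip_address("192.168.60.2")  # HA-Proxy Frontend NIC


def check_plan(workload, frontend, lb, haproxy_ip):
    """Return a list of problems found in a 3 NIC address plan."""
    problems = []
    # In the 3 NIC configuration the Load Balancer range lives on the Frontend network.
    if not lb.subnet_of(frontend):
        problems.append("Load Balancer range is not inside the Frontend network")
    if haproxy_ip not in frontend:
        problems.append("HA-Proxy Frontend IP is not on the Frontend network")
    # Assumption: the HA-Proxy's own address should not be handed out as a Load Balancer IP.
    if haproxy_ip in lb:
        problems.append("HA-Proxy Frontend IP falls inside the Load Balancer range")
    # Separate subnets are assumed here; they must then be routable to each other.
    if workload.overlaps(frontend):
        problems.append("Workload and Frontend networks overlap")
    return problems


problems = check_plan(workload_net, frontend_net, lb_range, haproxy_frontend_ip)
for problem in problems:
    print("WARNING:", problem)
```

An empty list means the plan is internally consistent. For a 2 NIC deployment, the same idea applies with the Load Balancer range checked against the Workload network instead.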
- [Update] A minimum of 3 ESXi hosts in a vSphere cluster is required. I’ve noted that a number of people have struggled to deploy vSphere with Tanzu, especially on 2-node vSAN clusters. The reason that 3 hosts are required is that the vSphere with Tanzu control plane is made up of 3 nodes, and these have anti-affinity rules, which means that they need to be placed on different hosts. With fewer than 3 hosts, the control plane requirements won’t be satisfied.
HA & DRS
- Make sure that both DRS and HA are enabled on the cluster where vSphere with Tanzu will be enabled
- Make sure that you have your desired Storage Policy created for the Supervisor Cluster VMs. I am using vSAN so I will use the default vSAN storage policy. You can create whichever policy is suitable.
- Create a Content Library for HA-Proxy OVA. The HA-Proxy OVA needs to be added to this Content Library and deployed from there. (HA-proxy v0.1.7 from this link)
- Create a Content Library for TKG images. This Content Library is used to store the virtual machine images that will be used for deploying the TKG cluster virtual machines. Use the TKG subscription URL for these images.
- Identify an FQDN and a static IP address for HA-Proxy on the Management network
- Identify a static IP address for HA-Proxy on the Workload network
- (Optional) Identify a static IP address for HA-Proxy – Frontend Network
- Identify a range of IP addresses for the Supervisor VMs on Management Network
- Identify a range of IP addresses for the Workload network. These will be used by both the Supervisor control plane nodes and the nodes that are provisioned for TKG “guest” clusters
- Identify a range of IP addresses for Load Balancers – these can be on either the Workload network or optionally on the Frontend network
- vSphere Distributed Switch (VDS) configured for all hosts in the cluster.
- [Update] Ensure that the Workload network and the Frontend network are routable to each other.
- Consider using an IP Scanner for the Frontend network. This will verify that the Load Balancer addresses plumbed up on the HA-Proxy are responding. I use Angry IP Scanner.
- Consider using a CIDR calculator which will help to figure out the correct CIDR mappings for a particular IP range. I use this one quite a bit.
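If you would rather script the CIDR calculation than use an online calculator, Python's standard `ipaddress` module can do the same job. The sketch below is only an illustration: the addresses are hypothetical, and `range_to_cidrs` is a helper name invented for this example. It returns the CIDR blocks that exactly cover an inclusive IP range.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Return the CIDR blocks that exactly cover first..last (inclusive)."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


# Hypothetical Load Balancer range of 32 addresses, aligned on a /27 boundary.
print(range_to_cidrs("192.168.60.64", "192.168.60.95"))  # ['192.168.60.64/27']

# A range that is not aligned on a boundary needs more than one block.
print(range_to_cidrs("192.168.60.10", "192.168.60.20"))
```

If a range comes back as several blocks, it is usually easier to pick a different, boundary-aligned range wherever a single CIDR is expected.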
Everything is now in place to complete the remaining tasks:
- Deploy/Configure the HA-Proxy
- Deploy/Configure Workload Management/vSphere with Tanzu
- Create a Namespace, log in to vSphere with Tanzu and deploy a TKG cluster