vSphere with Tanzu revisited in vSphere 7.0U3c

VMware recently released vSphere 7.0U3c, which brings a number of enhancements to vSphere with Tanzu and the TKG Service. Some of these enhancements have been described in recent posts, such as the new v1alpha2 Tanzu Kubernetes Cluster format and the new capabilities of the Namespace Service. In this post, I want to go back to basics and look at some changes to the vSphere with Tanzu installation and setup experience. One of the major enhancements is in the area of networking, with DHCP support added for both the Management network and the Workload network(s). The Management network is where the Supervisor nodes that make up the vSphere with Tanzu control plane are connected. The Supervisor nodes are also connected to the Workload network, which is where the control plane and worker nodes of the Tanzu Kubernetes workload clusters are connected when deployed via the TKG Service.

The vSphere 7.0U3c release includes support for Kubernetes v1.21 in the Supervisor Cluster. It also enables a vSphere administrator to edit the configuration of certain network settings after deployment, and in the case of the workload network, administrators can extend the service IP range as well as add new workload networks. This latter feature is something that I have heard requested a number of times.

I am not going to go through the steps of creating a vSphere cluster, enabling vSAN, setting up Storage Policies or deploying an NSX Advanced Load Balancer (ALB). There are a number of posts on this site which cover each of these steps independently. In this post I will be using vSphere Distributed Switch networking with the NSX ALB rather than an NSX-T networking stack, although NSX-T is another networking stack option.

I will start from the point of selecting Workload Management in the vSphere UI. This is the landing page which takes you through the requirements and pre-requisites.

I am going to assume that the underlying vSphere infrastructure (e.g. Distributed Switch, NSX ALB, vSphere cluster) meets all of these requirements. Clicking the “Get Started” button takes me to the selection of vCenter server and the networking stack, as shown below:

As mentioned earlier, I am not using NSX-T. Instead I will use a vSphere Distributed Switch and an NSX ALB (already deployed) to provide Load Balancing functionality to vSphere with Tanzu. Note the yellow warning at the top of the UI which states that this Load Balancer must be deployed before proceeding any further. This is message 1 of 2. The second message is highlighting that NSX-T is not available, and that vSphere Distributed Switch networking is the only option available.

Moving to the next step, select a vSphere Cluster for deploying the Supervisor Cluster. The UI will place clusters into either compatible or incompatible buckets. A cluster might be incompatible if it does not have a vSphere Distributed Switch configured, for example. Fortunately, I have a number of compatible vSphere clusters to choose from, as shown below. Three Supervisor nodes will be deployed as virtual machines on the selected cluster, each placed on a different ESXi host. Thus a minimum of 3 ESXi hosts is required in the vSphere cluster to proceed, although 4 ESXi hosts are recommended for resiliency during patching and upgrading.

The next step is to choose a Storage Policy for the Supervisor nodes. I have enabled vSAN, so in this example, the default vSAN storage policy is chosen. However, vSAN is not a requirement, and you could create a policy that selects any other shared storage in your vSphere cluster.

Next, the Supervisor cluster needs to be told about the Load Balancer. For the NSX ALB, the wizard requires a name, the IP address and port, a login username and password, and the SSL certificate. The SSL certificate can be found on the NSX ALB under Templates > Security > SSL/TLS Certificates. This certificate should have been created as part of the setup of the NSX ALB. Click on the encircled down arrow icon on the right-hand side, and a window pops up that shows both the key and the certificate. There is an option to copy the certificate.

Once all the relevant NSX ALB information is populated, you can continue with the rest of the setup. Note that the NSX ALB uses port 443, whilst the HA-Proxy uses port 5556. Be sure to include the port in the address string.
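To illustrate the point about including the port, here is a minimal Python sketch that builds the address string from an IP address and the load balancer type. It is only an illustration and not part of vSphere with Tanzu: the function name, the "nsx-alb" and "haproxy" labels and the example addresses are all made up for this post. The only facts it relies on are the two port numbers mentioned above.

```python
import ipaddress

# Ports quoted in the text above: NSX ALB listens on 443, HA-Proxy on 5556.
# The dictionary keys are hypothetical labels chosen for this sketch.
DEFAULT_PORTS = {"nsx-alb": 443, "haproxy": 5556}


def load_balancer_endpoint(ip, kind="nsx-alb", port=None):
    """Return an 'IP:port' string for the load balancer address field."""
    # Raises ValueError early if the IPv4 address is malformed.
    ipaddress.IPv4Address(ip)
    if port is None:
        port = DEFAULT_PORTS[kind]
    return f"{ip}:{port}"


# Hypothetical addresses, for illustration only.
print(load_balancer_endpoint("192.168.10.50"))
print(load_balancer_endpoint("192.168.10.51", kind="haproxy"))
```

An explicit port argument overrides the default, which may be useful if your load balancer has been configured to listen elsewhere.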

We now come to one of the newer features, which is the ability to use DHCP to provide IP addresses for certain networks. The first is the management network. If DHCP is selected, you no longer have to provide a range of IP addresses for the Supervisor nodes. The only additional setting is the selection of a distributed portgroup from the vSphere Distributed Switch. Of course, you must also ensure that a DHCP service is indeed available on that network. Here is how it looks:

DHCP can also be used for the Workload network, which is the final network that needs configuring. Remember that the workload network is used by both the Supervisor nodes (which are multi-homed on both the management network and the workload network) and the control plane and worker nodes of the workload Tanzu Kubernetes clusters. In this example, however, I am going to implement a static configuration for comparison. Later we will see how this can be modified and extended, and how new workload networks can be added after the setup is completed.

Important: One common gotcha with these deployments is that the workload network(s) need to have a route to the load balancer network. Once the API server is plumbed up with the front-end load balancer virtual IP address on the Supervisor control plane, it needs to be able to reach the workload IP address of each of the control plane nodes. If this is not configured correctly, the Supervisor control plane will never come online.
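Before typing a static configuration into the wizard, it can help to sanity-check the numbers. The following is a rough Python sketch using only the standard ipaddress module. The function names, the parameters and every address in it are assumptions made up for illustration, and the checks are my own suggestions rather than validation performed by the product. Note that the second function can only tell you whether the workload and load balancer networks are on different subnets, and therefore whether a route is needed. It cannot prove that the route actually exists.

```python
import ipaddress


def check_static_workload_network(subnet, gateway, start_ip, count):
    """Return a list of problems found; an empty list means none were found."""
    net = ipaddress.ip_network(subnet)
    gw = ipaddress.ip_address(gateway)
    first = ipaddress.ip_address(start_ip)
    last = first + (count - 1)  # last address in the requested range

    problems = []
    if gw not in net:
        problems.append("gateway is outside the workload subnet")
    if first not in net or last not in net:
        problems.append("IP range does not fit inside the workload subnet")
    elif first <= gw <= last:
        problems.append("IP range includes the gateway address")
    return problems


def needs_route_to_vips(workload_subnet, vip_subnet):
    """True when the two networks do not overlap, so traffic must be routed."""
    workload = ipaddress.ip_network(workload_subnet)
    vips = ipaddress.ip_network(vip_subnet)
    return not workload.overlaps(vips)


# Hypothetical values, for illustration only.
print(check_static_workload_network(
    "192.168.20.0/24", "192.168.20.1", "192.168.20.10", 20))
print(needs_route_to_vips("192.168.20.0/24", "192.168.30.0/24"))
```

If the second check returns True, that is the cue to verify on the physical or virtual router that the workload network really can reach the load balancer network.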

Once the network setup is completed, the UI prompts for a Content Library. This is where the Tanzu Kubernetes Release (TKr) images are stored. This content library is synced to an upstream VMware repository, which provides the latest images for building Tanzu Kubernetes Clusters (TKC). You can read more about how to build the subscribed Content Library here. These TKrs are referenced in the YAML manifests for workload cluster deployments, and allow DevOps to select the different versions of K8s that they wish to deploy. The content library in my example is called Kubernetes, but it can be called pretty much anything you wish.

We now come to the final step, which is choosing a size for the Supervisor cluster control plane node VMs and, optionally, providing one or more API server DNS names.

You can now click “Finish” to deploy the Supervisor cluster. This will begin the deployment of three virtual machines that make up the Supervisor control plane.

Supervisor Cluster Observations

When the Supervisor cluster is provisioned with the NSX ALB, you should notice that the VMs are initially deployed with a single NIC on the management network. However, one of the VMs gets a second IP address, which is a floating IP address that acts as the etcd leader address in the control plane. It means that if any of the Supervisor control plane nodes fails, communication is not impacted, since connectivity between vCenter and etcd in the Supervisor cluster is via this floating IP, which can be hosted on any of the control plane nodes.

The control plane VMs then undergo a reconfiguration in which a second NIC is added to each of them and connected to the workload network. Thus, one of the Supervisor cluster control plane nodes should have a network configuration with 5 IP addresses: 3 x IPv4 addresses and 2 x IPv6 addresses. The IPv4 addresses are the node's address on the management network, its address on the workload network, and the etcd leader IP address on the management network.

The other two control plane nodes should show 4 IP addresses, 2 x IPv4 and 2 x IPv6. The only difference is that these nodes do not have the etcd leader IP address.

If you navigate to the Workload Management view and select the Supervisor Clusters view, the cluster will probably still be in a state of “Configuring”. It should also have a Control Plane Node Address that matches the etcd leader IP address, indicating that it has not yet been allocated a Load Balancer IP address from the NSX ALB.

All going well, soon afterwards a change in the Control Plane Node Address should appear and this time it will be an IP address allocated from the NSX ALB from the range of VIPs (Virtual IPs) that were configured on the NSX ALB during setup.

A few moments after this update to the Control Plane Node Address, the Config Status should change from Configuring to Running.
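If you want to tell at a glance which of these two stages the Control Plane Node Address is in, the check boils down to asking which subnet the address belongs to. Here is a small Python sketch of that idea. The function name, the returned strings and the subnets are all hypothetical and chosen for this post. You would substitute your own management subnet and the VIP range configured on the NSX ALB.

```python
import ipaddress


def control_plane_address_stage(address, management_subnet, vip_subnet):
    """Classify the Control Plane Node Address by the subnet it falls in."""
    addr = ipaddress.ip_address(address)
    if addr in ipaddress.ip_network(vip_subnet):
        # Address comes from the VIP range configured on the load balancer.
        return "load balancer VIP allocated"
    if addr in ipaddress.ip_network(management_subnet):
        # Still showing the floating IP on the management network.
        return "floating management IP, still configuring"
    return "unknown"


# Hypothetical subnets and addresses, for illustration only.
MGMT = "192.168.10.0/24"
VIPS = "192.168.30.0/24"
print(control_plane_address_stage("192.168.10.20", MGMT, VIPS))
print(control_plane_address_stage("192.168.30.5", MGMT, VIPS))
```

The VIP range is tested first so that the sketch still gives a sensible answer in a lab where the VIP range happens to be carved out of the management subnet.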

Navigate to the Namespace view, and you should now observe the following success message and the ability to create namespaces.

Kubernetes CLI Tools

So far, so good. There is one final step though, and that is to download the Kubernetes CLI Tools, which allow you (and your DevOps team) to interact with vSphere with Tanzu from the command line, including the ability to create TKG workload clusters. Even if you already have a set of CLI tools from previous deployments, you should always download the latest tools to ensure you get full functionality. To download the tools, simply point a browser to the Control Plane Node Address seen previously, using https://. Alternatively, select the namespace in the vSphere UI and, under the Summary tab, in the Status window, links to the CLI tools are available as shown below:

Clicking on open will display the landing page where various tool distributions can be downloaded.

Once the tools (kubectl, kubectl-vsphere) have been downloaded and installed, they can be used to interact with the Supervisor cluster. I won’t go into too much detail here, but I will show you the help output followed by the login steps.

% kubectl-vsphere login --server=https://xx.xx.62.18 \
--vsphere-username administrator@vsphere.local \
--insecure-skip-tls-verify -h

Authenticate user with vCenter Namespaces:
To access Kubernetes, you must first authenticate against vCenter Namespaces.
You must pass the address of the vCenter Namspaces server and the username of
the user to authenticate, and will be prompted to provide further credentials.

Usage:
  kubectl-vsphere login [flags]

Examples:
  kubectl vsphere login --vsphere-username user@domain --server=https://10.0.1.10

Flags:
  -h, --help                                        help for login
      --insecure-skip-tls-verify                    Skip certificate verification (this is insecure).
      --server string                               Address of the server to authenticate against.
      --tanzu-kubernetes-cluster-name string        Name of the Tanzu Kubernetes cluster to login to.
      --tanzu-kubernetes-cluster-namespace string   Namespace in which the Tanzu Kubernetes cluster resides.
  -u, --vsphere-username string                     Username to authenticate.

Global Flags:
      --request-timeout string   Request timeout for HTTP client.
  -v, --verbose int              Print verbose logging information.

% kubectl-vsphere login --server=https://xx.xx.62.18 \
--vsphere-username administrator@vsphere.local \
--insecure-skip-tls-verify

Logged in successfully.

You have access to the following contexts:
xx.xx.62.18

If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`
%

% kubectl get nodes
NAME                               STATUS   ROLES                  AGE    VERSION
4224646f6ea51802fa48461a93654e89   Ready    control-plane,master   157m   v1.21.0+vmware.wcp.2
422480e1218564cc78909921aff30361   Ready    control-plane,master   157m   v1.21.0+vmware.wcp.2
42249d40784ee8e471b38bc78c883fc8   Ready    control-plane,master   162m   v1.21.0+vmware.wcp.2
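If you would like to script this check rather than eyeball it, the plain-text output is easy to parse. Below is a minimal Python sketch that reads the table printed by kubectl get nodes and confirms that three nodes are present and that all of them are Ready. The function names are my own invention, and the sample text is simply the output shown above. In practice you would capture the output of the command yourself, for example with the subprocess module, and for anything beyond a quick check a structured format such as the JSON output of kubectl is likely a more robust choice.

```python
def parse_nodes(table_text):
    """Parse the plain-text table from `kubectl get nodes` into dicts."""
    lines = [ln for ln in table_text.strip().splitlines() if ln.strip()]
    header = lines[0].split()  # NAME, STATUS, ROLES, AGE, VERSION
    # Columns are whitespace separated and contain no embedded spaces.
    return [dict(zip(header, line.split())) for line in lines[1:]]


def supervisor_healthy(nodes, expected=3):
    """True when the expected number of nodes exist and all are Ready."""
    return len(nodes) == expected and all(n["STATUS"] == "Ready" for n in nodes)


# The output shown above, reused as sample input.
SAMPLE = """\
NAME STATUS ROLES AGE VERSION
4224646f6ea51802fa48461a93654e89 Ready control-plane,master 157m v1.21.0+vmware.wcp.2
422480e1218564cc78909921aff30361 Ready control-plane,master 157m v1.21.0+vmware.wcp.2
42249d40784ee8e471b38bc78c883fc8 Ready control-plane,master 162m v1.21.0+vmware.wcp.2
"""

nodes = parse_nodes(SAMPLE)
print(len(nodes), supervisor_healthy(nodes))
```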

You are now ready to create namespaces using the Namespace Service, build VMs using the VM Service, or create Tanzu Kubernetes workload clusters using the TKG Service.

To close, I wanted to mention that a similar service is now available in VMware Cloud. VMware Cloud allows vSphere administrators to deploy Tanzu Kubernetes clusters without having to manage the underlying SDDC infrastructure. Read more about Tanzu Services here.

Published by Cormac