Getting started with vSphere with Tanzu

With the release of vSphere 7.0U1, vSphere with Kubernetes has been decoupled from VMware Cloud Foundation (VCF). VMware now has two vSphere with Kubernetes offerings: the original VCF-based offering, now referred to as VCF with Tanzu, and a newer offering outside of VCF, referred to as vSphere with Tanzu. This write-up steps through the deployment of the new vSphere with Tanzu with HA-Proxy. I won’t cover everything in this single post, but will do a series of 4 posts stepping through the process.

Differences: VCF with Tanzu and vSphere with Tanzu

I thought it useful to begin with a comparison between the two versions. When we promoted the original version of vSphere with Kubernetes on VCF, we used the following slide.

Using this slide, VMware talked about how different services in VCF with Tanzu provide containers natively in vSphere in the form of PodVMs, how NSX-T is used to provide load balancing for both the Supervisor cluster and the TKG clusters, how the vSphere CSI driver and CNS were integrated to provide persistent storage to both the Supervisor cluster and TKG clusters, and also how customers could deploy the Harbor Image Registry on the Supervisor cluster with a single click.

We also spent a lot of time explaining that vSphere with Kubernetes offered developers the vSphere with Kubernetes experience with PodVMs and the Supervisor cluster, but if developers wanted or needed a more native, upstream Kubernetes experience, then this was also available through the deployment of an upstream, conformant, VMware engineered and supported Tanzu Kubernetes cluster via a simple manifest file.

I modified the above slide to create the following slide, after seeing something similar during the vSphere 7.0U1 launch. This slide highlights some of the differences in our new vSphere with Tanzu offering.

There are quite a number of differences as you can see. I’ve noted them in the following table.

VCF with Tanzu	vSphere with Tanzu
VCF required.	VCF is not a requirement.
NSX-T required. Provides Load Balancer for both Supervisor and TKG, and provides overlays for PodVM to PodVM communication.	NSX-T not required. VDS networking now supported. HA-Proxy virtual appliance option for provisioning Load Balancers.
PodVM support – native containers on vSphere.	No PodVM support if NSX-T not used.
Calico CNI for TKG Pod to Pod traffic.	Antrea CNI for TKG Pod to Pod traffic (Calico CNI also available).
Integrated Harbor Image Registry.	No Harbor Image Registry (dependency on PodVM).

The major take-aways are that you no longer need VCF to deploy vSphere with Tanzu, nor do you need NSX-T to provide load balancers for Kubernetes. Note that both VCF and NSX-T are still supported for use with vSphere with Kubernetes. What is new is that we now use a HA-Proxy to provide load balancers (necessary for the Supervisor control plane API server and TKG cluster API servers). However, without NSX-T, you cannot use the PodVM construct in this release. Since Harbor utilized PodVMs, you no longer have the ability to deploy the embedded registry service. The other major difference is the switch from Calico to Antrea Container Network Interface (CNI) for the TKG clusters.

Note that if you do have NSX-T (v3.x) available, you can continue to use that network stack with vSphere with Tanzu, and in that case you will still have the PodVM construct available.

Requirements for HA-Proxy and vSphere with Tanzu

In this section, the full set of prerequisites for deploying HA-Proxy and vSphere with Tanzu is listed. Note that there are 2 options when deploying the HA-Proxy – it can be deployed with 2 NICs or 3 NICs. In the 2 NIC configuration, there is a Management network and a Workload network. The Management network communicates with the Supervisor control plane, whilst the Workload network is used both for communicating with the TKG node IP addresses and for providing Load Balancer IP addresses. In the 3 NIC configuration, a third network is configured, called the Frontend network. This moves the Load Balancer IP address range from the Workload network to the Frontend network. In the next post, we will discuss this requirement in more detail when we deploy the HA-Proxy, but from a requirements perspective, I am assuming a separate Frontend network is being used for the Load Balancer IP addresses.

Note that the TKG control plane nodes (which will be provisioned on the Workload network) will need to be able to communicate to load balancer IP addresses (provisioned on the Frontend network). You can use different subnets/VLANs but you will need to ensure that there is a route between them.
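
To make the 2 NIC versus 3 NIC difference concrete, here is a minimal Python sketch using only the standard ipaddress module. The subnets, the load balancer range and the function names are all made up for illustration; none of them come from HA-Proxy or vSphere. It simply checks that a proposed Load Balancer range sits inside the network that the chosen NIC layout expects.

```python
import ipaddress


def lb_range_network(nic_count, workload_cidr, frontend_cidr=None):
    """Return the network that should contain the Load Balancer IP range.

    2 NICs: Load Balancer IPs live on the Workload network.
    3 NICs: Load Balancer IPs move to the Frontend network.
    """
    if nic_count == 2:
        return ipaddress.ip_network(workload_cidr)
    if nic_count == 3:
        if frontend_cidr is None:
            raise ValueError("a 3 NIC deployment needs a Frontend network")
        return ipaddress.ip_network(frontend_cidr)
    raise ValueError("HA-Proxy is deployed with either 2 or 3 NICs")


def lb_range_fits(lb_cidr, nic_count, workload_cidr, frontend_cidr=None):
    """True if the Load Balancer range falls inside the expected network."""
    expected = lb_range_network(nic_count, workload_cidr, frontend_cidr)
    return ipaddress.ip_network(lb_cidr).subnet_of(expected)


# Hypothetical lab values, purely for illustration.
WORKLOAD = "192.168.10.0/24"
FRONTEND = "192.168.20.0/24"
LB_RANGE = "192.168.20.128/26"

# The range belongs to the Frontend network, so it suits a 3 NIC layout.
print(lb_range_fits(LB_RANGE, 3, WORKLOAD, FRONTEND))
# The same range is outside the Workload network, so it fails for 2 NICs.
print(lb_range_fits(LB_RANGE, 2, WORKLOAD))
```

A check like this says nothing about routing. The Workload and Frontend networks still need a route between them, which has to be verified on the physical network.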

vSphere Cluster

[Update] A minimum of 3 ESXi hosts in a vSphere cluster is required. I’ve noted that a number of people have struggled to deploy vSphere with Tanzu, especially on 2-node vSAN clusters. The reason that 3 hosts are required is that the vSphere with Tanzu control plane is made up of 3 nodes, and these have anti-affinity rules, which means that they need to be placed on different hosts. With fewer than 3 hosts, the control plane requirements won’t be satisfied.
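
The anti-affinity constraint can be illustrated with a toy Python sketch. This is not how DRS or vSphere actually places the control plane VMs; the function and host names are hypothetical, and the only point is that 3 nodes which must each land on a different host cannot be placed on 2 hosts.

```python
def place_control_plane(hosts, control_plane_nodes=3):
    """Assign each control plane node to its own host (anti-affinity).

    Returns a dict of node name -> host, or None when there are too few
    hosts to keep every node on a different host.
    """
    unique_hosts = sorted(set(hosts))
    if len(unique_hosts) < control_plane_nodes:
        return None
    return {
        f"control-plane-{i + 1}": unique_hosts[i]
        for i in range(control_plane_nodes)
    }


# A 2-node cluster cannot satisfy the rule, so no placement is returned.
print(place_control_plane(["esxi-01", "esxi-02"]))
# With 3 hosts, every control plane node gets a host of its own.
print(place_control_plane(["esxi-01", "esxi-02", "esxi-03"]))
```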

HA & DRS

Make sure that both DRS and HA are enabled on the cluster where vSphere with Tanzu will be enabled.

Storage Policies

Make sure that you have your desired Storage Policy created for the Supervisor Cluster VMs. I am using vSAN so I will use the default vSAN storage policy. You can create whichever policy is suitable.

Content Libraries

Create a Content Library for the HA-Proxy OVA. The HA-Proxy OVA needs to be added to this Content Library and deployed from there. (HA-Proxy v0.1.7 from this link)
Create a Content Library for TKG images. This Content Library is used to store the virtual machine images that will be used for deploying the TKG cluster virtual machines. Use the TKG subscription URL for these images.

Network Requirements

Identify an FQDN and a static IP address for HA-Proxy on the Management network
Identify a static IP address for HA-Proxy on the Workload network
(Optional) Identify a static IP address for HA-Proxy on the Frontend network
Identify a range of IP addresses for the Supervisor VMs on Management Network
Identify a range of IP addresses for the Workload network. These will be used by both the Supervisor control plane nodes, as well as nodes that are provisioned for TKG “guest” clusters
Identify a range of IP addresses for Load Balancers – these can be on either the Workload network or optionally on the Frontend network
vSphere Distributed Switch (VDS) configured for all hosts in the cluster.
[Update] Ensure that the Workload network and the Frontend network are routable to each other.
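
Before deploying anything, it can help to write the address plan down and check that none of the ranges collide. The following is a rough Python sketch of such a check using the standard ipaddress module. Every address, range and name in it is an invented lab value, not something taken from this environment.

```python
import ipaddress
from itertools import combinations


def ip_range(first, last):
    """Expand an inclusive first-last range into a set of IPv4 addresses."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end < start:
        raise ValueError(f"range {first}-{last} is reversed")
    return {ipaddress.IPv4Address(n) for n in range(start, end + 1)}


def find_overlaps(plan):
    """Return pairs of plan entries that share at least one address."""
    clashes = []
    for (name_a, ips_a), (name_b, ips_b) in combinations(plan.items(), 2):
        if ips_a & ips_b:
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical address plan, one entry per item in the list above.
plan = {
    "haproxy-management": ip_range("192.168.0.10", "192.168.0.10"),
    "supervisor-management": ip_range("192.168.0.11", "192.168.0.15"),
    "haproxy-workload": ip_range("192.168.10.2", "192.168.10.2"),
    "workload-nodes": ip_range("192.168.10.16", "192.168.10.63"),
    "haproxy-frontend": ip_range("192.168.20.2", "192.168.20.2"),
    "load-balancers": ip_range("192.168.20.128", "192.168.20.191"),
}

# An empty list means no two entries claim the same address.
print(find_overlaps(plan))
```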

Tools

Consider using an IP Scanner for the Frontend network. This will verify that the Load Balancer addresses plumbed up on the HA-Proxy are responding. I use Angry IP Scanner.
Consider using a CIDR calculator which will help to figure out the correct CIDR mappings for a particular IP range. I use this one quite a bit.
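
If you would rather script the CIDR calculation, Python’s standard ipaddress module can do the same job. This is only a small sketch with a hypothetical helper name and example ranges of my own choosing; it turns a first/last IP pair into the CIDR blocks that cover it exactly.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Return the CIDR blocks that exactly cover an inclusive IP range."""
    first_ip = ipaddress.IPv4Address(first)
    last_ip = ipaddress.IPv4Address(last)
    return [str(net) for net in ipaddress.summarize_address_range(first_ip, last_ip)]


# A range that lines up with a subnet boundary collapses to one block.
print(range_to_cidrs("192.168.20.128", "192.168.20.191"))  # ['192.168.20.128/26']

# A range that does not line up needs several smaller blocks.
print(range_to_cidrs("192.168.20.100", "192.168.20.120"))
```

The second call is the reason it pays to pick a Load Balancer range that starts and ends on a subnet boundary: an arbitrary range cannot be expressed as a single CIDR.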

Everything is now in place to complete the remaining tasks:

Good luck!

11 Replies to “Getting started with vSphere with Tanzu”

Art Fewell says:

September 24, 2020 at 3:10 pm

Awesome post Cormac, much needed info. One thing to note: Tanzu Basic/Standard include TKGm on v6.7u3 and support for Harbor, so if one does not have the vSphere Pod service enabled, they can still use Harbor standalone.
Rene says:

September 25, 2020 at 7:20 am

Hi Cormac, nice post! One question: we have to deliver a small Tanzu PoC to a “dark site”, aka no internet access. Is there a (..documented..) way to download the TKG images manually and upload them later from a laptop directly into the library?
1. Cormac says:
  
  September 25, 2020 at 8:21 am
  
  I have not seen one Rene, but let me ask around for you.
2. Cormac says:
  
  September 25, 2020 at 8:50 am
  
  Rene – is this what you are looking for? https://docs.vmware.com/en/VMware-vSphere/7.0/vmware-vsphere-with-kubernetes/GUID-09612B1E-4497-4EF1-844A-612C0FCC1D4E.html
  1. Rene says:
    
    September 25, 2020 at 9:01 am
    
    wohoo!! perfect!! thanx!
Alex Tanner says:

September 29, 2020 at 9:34 am

Hi Cormac, great post. One thing that I feel could be really helpful is perhaps elaborating on the implications of the ‘loss’ of the PodVM. You mention it in the context of the Harbor deployment, but does this mean that basically everything by way of pod workloads must now be deployed in a TKG Service environment – with nothing running on the Supervisor Cluster per se? It would be very helpful, perhaps with the aid of a diagram, to show what has changed. Many thanks
1. Cormac says:
  
  September 29, 2020 at 11:53 am
  
  That is exactly right Alex. The PodVM (aka native containers) is no longer available, unless you use NSX-T (which you can still do btw).
  However, the concept of namespace, and the simple way of deploying TKG “guest” clusters is still there with the HA-Proxy, and I think these alone are excellent features.
Tristan says:

October 8, 2020 at 7:37 pm

Hi Cormac, thanks a lot, it clarifies lots of things.
Quick question: if we don’t have NSX-T yet, can we deploy Workload Management on a cluster using vSphere networking, and when we finally have NSX-T implemented, upgrade it to use vSphere Pods? Or can we just create another Workload Management on a specific cluster and use NSX-T from the beginning?
Thanks
1. Cormac says:
  
  October 13, 2020 at 12:04 pm
  
  I *think* it is one or the other Tristan – I’m not sure how you would replace the underlying network layer to be honest. However, I will try to find a definitive answer for you.
Chance says:

November 6, 2020 at 6:44 pm

With the 2-nic configuration of the HA Proxy, can all your IPs be in the same subnet in a simple PoC lab setup?
1. Cormac says:
  
  November 8, 2020 at 9:08 am
  
  I’ve not tested having all 3 networks on the same subnet.
  I’ve only tested having the Management Network on its own subnet, and the Workload + Frontend network on their own network.
  
  Not sure if you will run into issues if you put everything onto one network.