Building a TKG Cluster in vSphere with Kubernetes

Cormac

4 years ago

Now that we have vSphere with Kubernetes deployed, we take the next logical step in this post and deploy a Tanzu Kubernetes Grid (TKG) guest cluster. [Update] Whilst guest cluster isn’t an official name for the Tanzu Kubernetes cluster, I’ll use it in this post to differentiate it from the Supervisor cluster deployed with vSphere with Kubernetes. TKG is a full CNCF certified Kubernetes distribution. It is deployed as a set of virtual machines, in accordance with a TanzuKubernetesCluster manifest which we will look at later. The OS and K8s distribution are also specified in the manifest. There may be many TKG guest clusters deployed on the same vSphere with Kubernetes infrastructure. Isolation/Multi-Tenancy is achieved through namespaces. Multiple namespaces may be created with one or more TKG guest clusters in each. A namespace has its own set of vSphere resources. This enables multiple developers or development teams to work simultaneously on the same vSphere platform, whilst namespaces ensure their respective activities do not impact each other or other tenants of the platform.

Disclaimer: “Once again, I want to be perfectly clear. This post is based on a pre-GA version of vSphere with Kubernetes. While the assumption is that not much should change between the time of writing and when the product becomes generally available, I want readers to be aware that feature behaviour and the user interface could still change before then.”

In this post, I will go through the following steps to deploy a TKG guest cluster in my already configured vSphere with Kubernetes environment.

Create a new namespace for my TKG guest cluster
Create a Content Library and sync it to the TKG supported image(s) for virtual machines
Ensure the image is visible in my namespace
Deploy a TKG guest cluster
Change to the new TKG guest cluster context
Create a simple PVC/PV to show enhancements to CNS (Cloud Native Storage) to handle TKG guest cluster volumes in vSphere with Kubernetes

Step 1. Create a new namespace

We’ve already seen how to do this in the earlier vSphere with Kubernetes post. The process is the same. Here are my existing namespaces:

Next, I create a new Namespace, this time called tkg-guest-01.

And here is my successfully created namespace where I plan to deploy the TKG guest cluster:

The next step is to Add Storage. As I did in the previous post, I am once again going to use the vSAN default storage policy. Remember that this policy shows up as a Kubernetes Storage Class in our namespace.

And that completes the namespace setup. Obviously you can modify the permissions and limits associated with the namespace if you so wish, but I’m going to leave them at the defaults for this exercise. Let’s move onto the Content Library next.

Step 2. Configure a Content Library

The next step is to create a Content Library so that we can sync it to the external TKG images. This is relatively straightforward, but I’ll include it here for completeness. The first thing to do is to provide a name for the Content Library. I have always used the name Kubernetes, as this was the guidance given back in the vSphere with Kubernetes beta days. I’m not completely sure if other names can be used. I guess the official docs will tell us at GA time. [Update] And they do: any name can be used. The other step here is to choose a vCenter Server. Since this environment was deployed using VMware Cloud Foundation (VCF) 4.0, there will be at least 2 vCenter Servers, one for the Management Domain and one for the Workload Domain (WLD) where we deployed vSphere with Kubernetes. It is the WLD vCenter Server that should be chosen.

The next step is to select Subscribed content library and provide the URL. I’ve blanked it out here, because as sure as eggs are eggs, it will change again for GA. Once the Subscription URL is confirmed, I’ll come back and update it. This again will be confirmed in the GA documentation. [Update] The URL is now confirmed in the docs – it is https://wp-content.vmware.com/v2/latest/lib.json

Select a datastore on which to store the Content Library. I’ve chosen my vSAN datastore.

Review the settings and Finish.

After a moment, we should start seeing Sync tasks appearing in the vSphere UI task bar.

We will come back to this later and query the available images from the kubectl CLI. However, there is one other important step, and that is to connect the Content Library to our vSphere cluster. Select the vSphere Cluster > Configure > Namespace > General and click on the Content Library Add Library button.

Select the Kubernetes Content Library that we just created, and click OK to add it to the cluster.

The General view should now look something like this.

That completes the work needed in the vSphere UI for the moment. Let’s now head down to the command line where we will do the TKG guest cluster deployment.

Step 3. Deploy the TKG guest cluster

We looked at some of the command line tools when we looked at the vSphere with Kubernetes deployment previously. There are two: kubectl and kubectl-vsphere. The former is the general Kubernetes command line tool for communicating with the API server. The latter is a vSphere with Kubernetes specific tool, primarily used for authentication. To begin, we log in, list the nodes and the namespaces, and set our context to the new namespace we created earlier in step 1. We then make sure that the StorageClass is available and that the TKG guest cluster virtual machine image is synced and available in the Content Library. This image is used to create the control plane VM and worker node VMs in the TKG guest cluster.

$ kubectl-vsphere login --vsphere-username administrator@vsphere.local --server=20.0.0.1 \
--insecure-skip-tls-verify

Password: ***************
Logged in successfully.

You have access to the following contexts:
   20.0.0.1
   cormac-ns
   tkg-guest-01

If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`


$ kubectl get nodes
NAME                               STATUS   ROLES    AGE   VERSION
423aace716b34ba158132148a4d1cb47   Ready    master   24h   v1.16.7-2+bfe512e5ddaaaa
423ac0a8088af7f83a732d1e317296d1   Ready    master   24h   v1.16.7-2+bfe512e5ddaaaa
423ae8cd4e4a6d6cb36858057c11a78b   Ready    master   24h   v1.16.7-2+bfe512e5ddaaaa
esxi-dell-g.rainpole.com           Ready    agent    24h   v1.16.7-sph-4d52cd1
esxi-dell-j.rainpole.com           Ready    agent    24h   v1.16.7-sph-4d52cd1
esxi-dell-l.rainpole.com           Ready    agent    24h   v1.16.7-sph-4d52cd1


$ kubectl get ns
NAME                                STATUS   AGE
cormac-ns                           Active   21h
default                             Active   3d
kube-node-lease                     Active   3d
kube-public                         Active   3d
kube-system                         Active   3d
tkg-guest-01                        Active   34m
vmware-system-capw                  Active   3d
vmware-system-csi                   Active   3d
vmware-system-kubeimage             Active   3d
vmware-system-nsx                   Active   3d
vmware-system-registry              Active   3d
vmware-system-registry-1812432932   Active   21h
vmware-system-tkg                   Active   3d
vmware-system-ucs                   Active   3d
vmware-system-vmop                  Active   3d

$ kubectl config get-contexts
CURRENT   NAME           CLUSTER    AUTHINFO                                   NAMESPACE
*         20.0.0.1       20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local
          cormac-ns      20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          tkg-guest-01   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   tkg-guest-01


$ kubectl config use-context tkg-guest-01
Switched to context "tkg-guest-01".


$ kubectl config get-contexts

CURRENT   NAME           CLUSTER    AUTHINFO                                   NAMESPACE
          20.0.0.1       20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local
          cormac-ns      20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
*         tkg-guest-01   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   tkg-guest-01


$ kubectl get sc
NAME                          PROVISIONER              AGE
vsan-default-storage-policy   csi.vsphere.vmware.com   38m


$ kubectl get virtualmachineimages
NAME                                            AGE
photon-3-k8s-v1.16.8---vmware.1-tkg.1.6b5edc7   44m
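
The last two checks above can be wrapped in a small pre-flight function if you want to script them. This is only a minimal sketch: preflight is a hypothetical helper name, and it assumes you are already logged in and have switched to the target namespace context.

```shell
# Hypothetical helper: succeeds only if the named StorageClass exists and at
# least one VirtualMachineImage is visible in the current context.
# The body runs in a subshell so its variables do not leak into the caller.
preflight() (
  sc=$1
  if ! kubectl get sc "$sc" >/dev/null 2>&1; then
    echo "StorageClass '$sc' not found in this context" >&2
    exit 1
  fi
  # -o name prints one resource per line, or nothing if none exist
  images=$(kubectl get virtualmachineimages -o name 2>/dev/null)
  if [ -z "$images" ]; then
    echo "no VirtualMachineImage visible yet; is the Content Library synced?" >&2
    exit 1
  fi
  echo "StorageClass and image(s) available:"
  echo "$images"
)
```

In this environment it would be called as preflight vsan-default-storage-policy.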

Looks like everything is in order. We’ve switched to the new namespace, and have verified that the Storage Class and Virtual Machine Image are both available. We can now proceed with deploying the TKG guest cluster. This is the manifest that I am using to deploy the cluster.

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: ch-tkg-cluster01
spec:
  topology:
    controlPlane:
      count: 1
      class: guaranteed-small
      storageClass: vsan-default-storage-policy
    workers:
      count: 3
      class: guaranteed-small
      storageClass: vsan-default-storage-policy
  distribution:
    version: v1.16

In this manifest, we have requested a single control plane node and 3 worker nodes. We will use the vsan-default-storage-policy for the Storage Class as it is the only one we configured in this namespace. The size of the nodes is set to guaranteed-small (a full list of options will appear in the GA documentation). Guaranteed is similar to resource reservations on vSphere. You can also display the available options by running the following command:

$ kubectl get virtualmachineclasses
NAME                AGE
best-effort-large   46h
best-effort-medium  46h
best-effort-small   46h
best-effort-xlarge  46h
best-effort-xsmall  46h
guaranteed-large    46h
guaranteed-medium   46h
guaranteed-small    46h
guaranteed-xlarge   46h
guaranteed-xsmall   46h

You can then describe the class that you are interested in for more details, including its resources (the Hardware and Policies sections of the Spec below).

$ kubectl describe virtualmachineclasses guaranteed-small
Name:         guaranteed-small
Namespace:
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"vmoperator.vmware.com/v1alpha1","kind":"VirtualMachineClass","metadata":{"annotations":{},"name":"guaranteed-small"},"spec"...
API Version:  vmoperator.vmware.com/v1alpha1
Kind:         VirtualMachineClass
Metadata:
  Creation Timestamp:  2020-04-06T08:51:21Z
  Generation:          1
  Resource Version:    1342
  Self Link:           /apis/vmoperator.vmware.com/v1alpha1/virtualmachineclasses/guaranteed-small
  UID:                 e6a23ad0-d67c-42ac-8a4c-34af6ac2ee07
Spec:
  Hardware:
    Cpus:    2
    Memory:  4Gi
  Policies:
    Resources:
      Requests:
        Cpu:     2000m
        Memory:  4Gi
Events:          <none>

Finally, we have chosen distribution version v1.16 which matches the VM image in our Content Library; you do not have to put the full image name in the manifest. Looks like we are good to go.
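
If you plan to deploy more than one guest cluster, the manifest can be generated rather than hand-edited. The following is a minimal sketch, not an official tool: render_tkc is a hypothetical helper, and it only fills in the fields used in the manifest shown in this post.

```shell
# Hypothetical helper: print a TanzuKubernetesCluster manifest to stdout.
# Arguments: name, control plane count, worker count, VM class,
#            storage class, distribution version
render_tkc() {
  cat <<EOF
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: $1
spec:
  topology:
    controlPlane:
      count: $2
      class: $4
      storageClass: $5
    workers:
      count: $3
      class: $4
      storageClass: $5
  distribution:
    version: $6
EOF
}
```

Running render_tkc ch-tkg-cluster01 1 3 guaranteed-small vsan-default-storage-policy v1.16 and redirecting the output to a YAML file should give the same fields and values as the manifest used in this post.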

$ kubectl apply -f cormac-tkg-cluster-01.yaml
tanzukubernetescluster.run.tanzu.vmware.com/ch-tkg-cluster01 created

Now there are a number of things that are going to go on under the covers to deploy the TKG guest cluster. A number of the moving parts are discussed in this Project Pacific post from VMworld 2019. These involve Guest Cluster Manager, Cluster API and the VM Operator. I’m not going to delve into the details here, but check back on the Pacific post if you are interested in how these components work together to deploy a TKG guest cluster.

Step 4. Review the TKG deployment via kubectl

Let’s see how the deployment has progressed. First let’s look at the cluster.

$ kubectl get TanzuKubernetesCluster
NAME               CONTROL PLANE   WORKER   DISTRIBUTION                     AGE   PHASE
ch-tkg-cluster01   1               3        v1.16.8+vmware.1-tkg.3.60d2ffd   12m   running
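
The cluster does not reach the running phase immediately, so a script may want to poll for it. The sketch below assumes the phase is exposed as .status.phase on the TanzuKubernetesCluster object, which is inferred from the PHASE column above rather than checked against the API reference. wait_for_tkc is a hypothetical helper name.

```shell
# Hypothetical helper: poll the cluster phase until it is "running".
# Arguments: cluster name, timeout in seconds (default 900),
#            poll interval in seconds (default 15)
# The body runs in a subshell, so "exit" only ends the function.
wait_for_tkc() (
  name=$1
  timeout=${2:-900}
  interval=${3:-15}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Assumption: .status.phase holds the value shown in the PHASE column
    phase=$(kubectl get tanzukubernetescluster "$name" \
      -o jsonpath='{.status.phase}' 2>/dev/null)
    echo "phase: ${phase:-unknown}"
    [ "$phase" = "running" ] && exit 0
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out after ${timeout}s waiting for $name" >&2
  exit 1
)
```

For example, wait_for_tkc ch-tkg-cluster01 1200 30 would check every 30 seconds for up to 20 minutes.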

And we can also query the VMs that back the control plane and nodes:

$ kubectl get VirtualMachines
NAME                                              AGE
ch-tkg-cluster01-control-plane-82qdg              11m
ch-tkg-cluster01-workers-l5rxp-6bb6c57c49-6vg6h   5m33s
ch-tkg-cluster01-workers-l5rxp-6bb6c57c49-hrrcj   5m32s
ch-tkg-cluster01-workers-l5rxp-6bb6c57c49-qs6mg   5m32s

What is very interesting is a describe against the cluster. We can see the topology (1 control plane node and 3 worker nodes). We can see the distribution version. We can also see the various addons used in the cluster. Note the Cni and Csi entries in the Addons section below. I will discuss those in a bit more detail next.

$ kubectl describe TanzuKubernetesCluster ch-tkg-cluster01
Name:         ch-tkg-cluster01
Namespace:    tkg-guest-01
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"run.tanzu.vmware.com/v1alpha1","kind":"TanzuKubernetesCluster","metadata":{"annotations":{},"name":"ch-tkg-cluster01","name...
API Version:  run.tanzu.vmware.com/v1alpha1
Kind:         TanzuKubernetesCluster
Metadata:
  Creation Timestamp:  2020-04-02T12:50:56Z
  Finalizers:
    tanzukubernetescluster.run.tanzu.vmware.com
  Generation:        1
  Resource Version:  1702313
  Self Link:         /apis/run.tanzu.vmware.com/v1alpha1/namespaces/tkg-guest-01/tanzukubernetesclusters/ch-tkg-cluster01
  UID:               50ba1cfd-dd88-4f71-90f9-28fcc1303df9
Spec:
  Distribution:
    Full Version:  v1.16.8+vmware.1-tkg.3.60d2ffd
    Version:       v1.16
  Settings:
    Network:
      Cni:
        Name:  calico
      Pods:
        Cidr Blocks:
          192.168.0.0/16
      Service Domain:  cluster.local
      Services:
        Cidr Blocks:
          10.96.0.0/12
  Topology:
    Control Plane:
      Class:          guaranteed-small
      Count:          1
      Storage Class:  vsan-default-storage-policy
    Workers:
      Class:          guaranteed-small
      Count:          3
      Storage Class:  vsan-default-storage-policy
Status:
  Addons:
    Cloudprovider:
      Name:     vmware-guest-cluster
      Status:   applied
      Version:  v1.16.8+vmware.1-tkg.3.60d2ffd
    Cni:
      Name:     calico
      Status:   applied
      Version:  v1.16.8+vmware.1-tkg.3.60d2ffd
    Csi:
      Name:     pvcsi
      Status:   applied
      Version:  v1.16.8+vmware.1-tkg.3.60d2ffd
    Dns:
      Name:     CoreDNS
      Status:   applied
      Version:  v1.6.5_vmware.2
    Proxy:
      Name:     kube-proxy
      Status:   applied
      Version:  1.16.8+vmware.1
    Psp:
      Name:     defaultpsp
      Status:   applied
      Version:  v1.16.8+vmware.1-tkg.3.60d2ffd
  Cluster API Status:
    API Endpoints:
      Host:  20.0.0.3
      Port:  6443
    Phase:   provisioned
  Node Status:
    Ch - Tkg - Cluster 01 - Control - Plane - 82 Qdg:                         ready
    Ch - Tkg - Cluster 01 - Workers - L 5 Rxp - 6 Bb 6 C 57 C 49 - 6 Vg 6 H:  ready
    Ch - Tkg - Cluster 01 - Workers - L 5 Rxp - 6 Bb 6 C 57 C 49 - Hrrcj:     ready
    Ch - Tkg - Cluster 01 - Workers - L 5 Rxp - 6 Bb 6 C 57 C 49 - Qs 6 Mg:   ready
  Phase:                                                                      running
  Vm Status:
    Ch - Tkg - Cluster 01 - Control - Plane - 82 Qdg:                         ready
    Ch - Tkg - Cluster 01 - Workers - L 5 Rxp - 6 Bb 6 C 57 C 49 - 6 Vg 6 H:  ready
    Ch - Tkg - Cluster 01 - Workers - L 5 Rxp - 6 Bb 6 C 57 C 49 - Hrrcj:     ready
    Ch - Tkg - Cluster 01 - Workers - L 5 Rxp - 6 Bb 6 C 57 C 49 - Qs 6 Mg:   ready
Events:                                                                       <none>

CNI – Calico

In the Addons section above, under CNI (Container Network Interface), there is a reference to Calico. This is because TKG clusters use Calico to provide Pod to Pod communication. The CIDRs for the Pods and Services are visible further back in the describe output, in the Spec section.

CSI – pvCSI

Also in the Addons section above, under CSI (Container Storage Interface), there is a reference to the paravirtual CSI, or pvCSI. This is the modified version of the CSI driver for TKG guest clusters. It is called pvCSI because it “proxies” requests from the guest cluster to the Supervisor cluster, which in turn communicates with vCenter and CNS to create persistent volumes (first class disks, VMDKs) on the appropriate vSphere storage. Shortly we will see some updates to CNS which reflect this level of indirection.

Step 5. Review the TKG deployment via UI

From a UI perspective, we can now see the TKG cluster deployed in the tkg-guest-01 namespace. We can also see the control plane node and the three worker nodes.

If we select the tkg-guest-01 Namespace > Configure > VMware Resources > Tanzu Kubernetes, we can see more details about the TKG cluster. Note that the API Server’s Load Balancer IP address (20.0.0.3) is provided from an Ingress range that we provided during the initial vSphere with Kubernetes deployment.

Thus, in this release, we have NSX-T taking care of the north-south traffic from the TKG cluster (Ingress and Egress for Load Balancers and SNAT), and Calico taking care of the east-west traffic. Pods get allocated IP addresses from the 192.168.0.0/16 range by Calico, and Services get allocated IP addresses from the 10.96.0.0/12 range.
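
To make those two ranges concrete, here is a small sketch in plain shell arithmetic that tests whether an address falls inside a CIDR block. The function names ip_to_int and ip_in_cidr are made up for illustration and are not part of any TKG tooling.

```shell
# Convert a dotted-quad IPv4 address to an integer.
# Assumes octets are written without leading zeros.
# The body runs in a subshell so the IFS change does not leak.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 is inside CIDR block $2 (for example 10.96.0.0/12).
ip_in_cidr() (
  net=${2%/*}
  bits=${2#*/}
  # Keep only the top "bits" bits of a 32-bit value
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)
```

For example, ip_in_cidr 10.111.255.255 10.96.0.0/12 succeeds while ip_in_cidr 10.112.0.0 10.96.0.0/12 fails, because a /12 that starts at 10.96.0.0 ends at 10.111.255.255.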

One last thing to show you is the VMware Resources > Virtual Machines view, where we can see details about the TKG cluster node VMs, including the manifest for the VM class.

The VM class can be viewed to see details about how the node was configured, including its resource guarantee.

Step 6. Change context to TKG, operate at guest cluster level

Now we switch contexts. Rather than use the namespace context, we switch context to the TKG cluster. This enables us to run operations in the context of the guest cluster. To do this, log out and log back in, specifying the TKG cluster namespace and cluster name in the login command. The login is a rather long command, as you can see below.

$ kubectl-vsphere logout
Your KUBECONFIG context has changed.
The current KUBECONFIG context is unset.
To change context, use `kubectl config use-context <workload name>`
Logged out of all vSphere namespaces.


$ kubectl-vsphere login --vsphere-username administrator@vsphere.local \
--server=20.0.0.1 --insecure-skip-tls-verify --tanzu-kubernetes-cluster-namespace=tkg-guest-01 \
--tanzu-kubernetes-cluster-name=ch-tkg-cluster01

Password: **************
Logged in successfully.

You have access to the following contexts:
   20.0.0.1
   ch-tkg-cluster01
   cormac-ns
   tkg-guest-01
   tkg-guest-02


If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`


$ kubectl config get-contexts
CURRENT   NAME               CLUSTER    AUTHINFO                                   NAMESPACE
          20.0.0.1           20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local
*         ch-tkg-cluster01   20.0.0.3   wcp:20.0.0.3:administrator@vsphere.local
          cormac-ns          20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          tkg-guest-01       20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   tkg-guest-01
          tkg-guest-02       20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   tkg-guest-02
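
Since the guest cluster login is long, it can be wrapped in a function. This is a minimal sketch using only the flags shown in the login command above. tkc_login and the three variables are hypothetical names, and the defaults are the values from this lab environment.

```shell
# Defaults taken from this environment; override them via the environment.
SUPERVISOR=${SUPERVISOR:-20.0.0.1}
VSPHERE_USER=${VSPHERE_USER:-administrator@vsphere.local}
# Allows the binary to be swapped out, for example for a dry run with echo.
KUBECTL_VSPHERE=${KUBECTL_VSPHERE:-kubectl-vsphere}

# Hypothetical helper: log in to a TKG guest cluster.
# Arguments: Supervisor namespace, guest cluster name
tkc_login() {
  "$KUBECTL_VSPHERE" login \
    --vsphere-username "$VSPHERE_USER" \
    --server="$SUPERVISOR" \
    --insecure-skip-tls-verify \
    --tanzu-kubernetes-cluster-namespace="$1" \
    --tanzu-kubernetes-cluster-name="$2"
}
```

With this in place, tkc_login tkg-guest-01 ch-tkg-cluster01 issues the same login as above, and setting KUBECTL_VSPHERE=echo prints the arguments instead of logging in.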

Now that we are in the context of the cluster, kubectl commands should only apply to this context. For example, when I run some kubectl commands, the output should reflect the TKG guest cluster and not the Supervisor cluster. Let’s display the K8s nodes of the TKG cluster to prove my point. As you can see, the output now reflects the TKG cluster. Compare this to the list of nodes we displayed when we first logged in to the namespace back in step 3.

$ kubectl get nodes -o wide
NAME                                              STATUS   ROLES    AGE     VERSION            INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION      CONTAINER-RUNTIME
ch-tkg-cluster01-control-plane-82qdg              Ready    master   12m     v1.16.8+vmware.1   10.244.1.18   <none>        VMware Photon OS/Linux   4.19.97-5.ph3-esx   docker://18.9.9
ch-tkg-cluster01-workers-l5rxp-6bb6c57c49-6vg6h   Ready    <none>   4m34s   v1.16.8+vmware.1   10.244.1.20   <none>        VMware Photon OS/Linux   4.19.97-5.ph3-esx   docker://18.9.9
ch-tkg-cluster01-workers-l5rxp-6bb6c57c49-hrrcj   Ready    <none>   4m40s   v1.16.8+vmware.1   10.244.1.19   <none>        VMware Photon OS/Linux   4.19.97-5.ph3-esx   docker://18.9.9
ch-tkg-cluster01-workers-l5rxp-6bb6c57c49-qs6mg   Ready    <none>   4m34s   v1.16.8+vmware.1   10.244.1.21   <none>        VMware Photon OS/Linux   4.19.97-5.ph3-esx   docker://18.9.9
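
Because the Supervisor and guest cluster contexts sit side by side in the same kubeconfig, it is easy to run a command against the wrong one. A simple guard, sketched here with a hypothetical helper name require_context, is to compare the current context before doing anything else.

```shell
# Hypothetical helper: fail unless the current kubectl context is $1.
# The body runs in a subshell so its variable does not leak.
require_context() (
  current=$(kubectl config current-context 2>/dev/null)
  if [ "$current" != "$1" ]; then
    echo "expected context '$1' but current context is '${current:-<unset>}'" >&2
    exit 1
  fi
)
```

A script aimed at the guest cluster could then start with require_context ch-tkg-cluster01 || exit 1.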

Step 7. CNS in TKG guest clusters in vSphere with Kubernetes

Earlier we saw the reference to pvCSI and we mentioned how it integrates with CNS to show both Supervisor Cluster and TKG guest cluster information. Let’s take a quick look at that now. While this cluster can now be used in the same way as a standard Kubernetes deployment, I am only going to use a very simple PVC (Persistent Volume Claim) manifest file to create a standalone PV (Persistent Volume) to show the new CNS capabilities.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tkg-pvc
spec:
  storageClassName: vsan-default-storage-policy
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

We now go ahead and create the PV on our TKG guest cluster.

$ kubectl apply -f pvc.yaml
persistentvolumeclaim/tkg-pvc created


$ kubectl get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
tkg-pvc   Bound    pvc-48317d1c-7f72-4dfd-bf35-8f7b3e930204   1Gi        RWO            vsan-default-storage-policy   30s


$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS                  REASON   AGE
pvc-48317d1c-7f72-4dfd-bf35-8f7b3e930204   1Gi        RWO            Delete           Bound    default/tkg-pvc   vsan-default-storage-policy            13s

Now, remember, this is a Persistent Volume that we requested on the TKG guest cluster. This request needs to make its way to the Supervisor cluster, so that it can communicate with the vCenter server and CNS and request the volume create operation as well as retrieve and store information about the volume. There is a Persistent Volume Claim at the Guest level and a corresponding Persistent Volume Claim at the Supervisor level. The CNS UI in vSphere with Kubernetes shows both when a PV is created in a TKG cluster.

If we select the vSphere cluster object and navigate to Monitor > Cloud Native Storage > Container Volume, the Persistent Volume is visible. We can see some information about the PV.

Next, if we click on the volume for more details, we see additional Kubernetes related information, including information about this Persistent Volume at both the Supervisor and Guest cluster level.

So VI Admins have the ability to trace a K8s volume created on a Guest Cluster all the way back to the Supervisor cluster and vSphere. I’ve only started looking at this behaviour myself and I hope to do a lot more with it going forward.

That concludes this post on a first look at Tanzu Kubernetes Grid, deployed as a guest cluster on vSphere with Kubernetes. Let me know what you think.