Getting started with VCF 4.0 Part 3 - vSphere with Kubernetes in a Workload Domain

At this point, we have a fully configured workload domain which includes an NSX-T Edge deployment. Check here for the previous VCF 4.0 deployment steps. We are now ready to go ahead and deploy vSphere with Kubernetes, formerly known as Project Pacific. Via SDDC Manager in VMware Cloud Foundation 4.0, we ensure that an NSX-T Edge is available, and we also ensure that the Workload Domain is sufficiently licensed to enable vSphere with Kubernetes.

Disclaimer: “To be clear, this post is based on a pre-GA version of VMware Cloud Foundation 4.0. While the assumption is that not much should change between the time of writing and when the product becomes generally available, I want readers to be aware that feature behaviour and the user interface could still change before then.”

Validate Workload Domain

The first step in the process is to validate that the workload domain can support vSphere with Kubernetes. The SDDC Manager now has a new section called Solutions. Currently, there is a single Solution called Kubernetes – Workload Management. To begin with, there are obviously no workload management solutions created:

Click on the Deploy link to get started. The first window that is displayed is the prerequisites – you need the correct license edition, a configured NSX-T based workload domain with an NSX-T Edge cluster and various subnets, 2 of which need to be routed (for Ingress and Egress).

Once you have selected all of the above, we can begin the workload management deployment. In the cluster section, select the workload domain. If there are compatibility issues, the cluster will not be displayed in the compatible section. Instead, it will be displayed in the incompatible section. The incompatible view will also display the reason why the cluster is incompatible. In the screenshot below, my WLD has not been licensed appropriately, and so is not compatible with vSphere with Kubernetes.

Once the incompatible reasons have been addressed (e.g. correct license edition applied to the VI WLD), the WLD will appear in the compatible section and you can proceed to the validation step.

And now the full validation takes place where we ensure that credentials, resources and networking all exist and are available for the rollout of vSphere with Kubernetes.

Once the requirements for vSphere with Kubernetes have been validated, we can connect to our vSphere environment to complete the vSphere with Kubernetes setup. Click the button Complete in vSphere:

vSphere with Kubernetes deployment

After clicking on the ‘Complete in vSphere’ button, a vSphere client is launched and we are placed at the starting point to Enable Workload Management. There are 5 steps in total. The first step is to select a vSphere cluster, since there could be multiple clusters in the same VI Workload Domain. In this example, there is only one, so we select that.

After selecting a cluster, the next step is to choose a control plane size in Cluster Settings. In other words, you decide what resources will be allocated to the Supervisor Kubernetes Cluster control plane virtual machines when they are deployed on the Supervisor Cluster.

If some of this terminology such as Supervisor, Native Pods, Spherelet, etc. is new to you, you can check back on this Project Pacific post for an overview. Project Pacific was the original name for vSphere with Kubernetes.

For availability purposes, 3 x control plane nodes are provisioned, and are called Supervisor Control Plane VMs. The worker nodes, in the case of vSphere with Kubernetes, are the ESXi hosts in the cluster.

The next step is to provide network details for both the control plane nodes and various IP ranges for the workload network, such as Pods and Services. The Pod and Service CIDRs do not need to be routable, but the Ingress and Egress CIDRs certainly do. The prerequisites state that a minimum of a /27 CIDR is required for both Ingress and Egress. This is a contiguous range of 32 IP addresses. Note that in my lab testing, I was able to deploy vSphere with Kubernetes with a /28, which is only 16 IP addresses, of which 14 are usable. On deployment, 12 SNAT rules were created immediately, so a /28 could only be used for the most basic of deployments. Each new namespace also requires a SNAT rule, so with only 2 of the 14 usable addresses left you would be limited to 2 namespaces at most. To do something useful with vSphere with Kubernetes you will definitely need the /27 CIDR, at least for the Egress. [Update] Depending on your network configuration, you may also need to add a route between the Egress network and your DNS server so that your Pods can resolve names when pulling from Google, Docker or whichever external repositories you wish to use.
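The /27 versus /28 arithmetic above can be checked with a few lines of Python. This is only a sketch based on what I observed in my lab: it assumes 12 SNAT rules at deployment and one extra Egress address per namespace, and it uses the same "usable" convention as the text (total addresses minus 2). The function name and the 10.30.0.0 example range are made up for illustration.

```python
import ipaddress

# Assumption from the lab observation above: 12 SNAT rules are created at
# deployment time, and every new namespace consumes one more Egress address.
SNAT_RULES_AT_DEPLOYMENT = 12


def egress_headroom(cidr: str, initial_snat: int = SNAT_RULES_AT_DEPLOYMENT) -> dict:
    """Return address counts for an Egress CIDR and the namespaces left over."""
    network = ipaddress.ip_network(cidr, strict=True)
    total = network.num_addresses
    # Same convention as the text: exclude the network and broadcast addresses.
    usable = total - 2
    return {
        "total": total,
        "usable": usable,
        "namespaces_left": max(usable - initial_snat, 0),
    }


# Hypothetical example ranges; substitute your own Egress CIDR.
for cidr in ("10.30.0.0/28", "10.30.0.0/27"):
    print(cidr, egress_headroom(cidr))
```

Under those assumptions, the /28 leaves room for 2 namespaces (16 total, 14 usable), while the /27 leaves room for 18 (32 total, 30 usable).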

The next step is to select storage policies for the various disks required for vSphere with Kubernetes. Basically what you are doing in this step is picking a storage policy for the Control Plane Node disks, Ephemeral (temporary) disks, and the Image Cache.

When you click on Select Storage, the list of available storage policies is displayed. Simply pick the one that you wish to use for the particular storage type.

Finally, review the selection and finish. This will start the deployment of vSphere with Kubernetes.

Progress can be monitored from the vSphere client, in particular from the Recent Tasks pane.

There are a lot of actions which now take place. Some of these tasks include, but are not limited to:

The deployment of 3 x Supervisor Control Plane VMs
The creation of a set of SNAT rules (Egress) in NSX-T for a whole array of K8s services
The creation of a Load Balancer (Ingress) in NSX-T for the K8s control plane
The installation of the Spherelet on the ESXi hosts so that they behave as Kubernetes worker nodes

From NSX-T we can see the 12 x SNAT rules:

As mentioned earlier, as soon as you begin to build a namespace, a new SNAT rule is created. Thus it is important to get the Egress CIDR set correctly at deployment or you can very quickly run out of addresses.

I mentioned that there is also a Load Balancer created – this is used for the multi-node Control Plane API Server. Rather than connecting to a single control plane node, we can connect to the LB virtual IP address. This means that if there is an issue with any of the control plane virtual machines, it won’t be noticeable. Connections to the API server will simply be redirected to a different control plane node at the back-end. Here is the Ingress/Load Balancer IP address used in my environment for the Control Plane:

You can monitor all the vSphere with Kubernetes deployment activity from a log file called wcpsvc.log on the vCenter server. I normally leave a tail command running to see the detailed activity:

root@vcsa-04 [ ~ ]# cd /var/log/vmware/wcp

root@vcsa-04 [ /var/log/vmware/wcp ]# ls -l
total 18532
-rw------- 1 root root   23102 Mar 20 09:00 gcm-telemetry
-rw-r--r-- 1 root root 9445965 Mar 20 09:02 nsxd.log
-rw------- 1 root root     111 Mar 19 11:05 stdstream.log-0.stderr
-rw------- 1 root root      42 Mar 19 11:05 stdstream.log-0.stdout
-rw------- 1 root root    1865 Mar 19 04:06 stdstream.log-1.stderr
-rw------- 1 root root      42 Mar 18 16:06 stdstream.log-1.stdout
-rw------- 1 root root    3719 Mar 20 08:25 stdstream.log.stderr
-rw------- 1 root root      42 Mar 19 12:20 stdstream.log.stdout
-rw-r--r-- 1 root root  778392 Mar 19 12:09 wcpsvc-2020-03-19T12-09-03.220.log.gz
-rw-r--r-- 1 root root 8687263 Mar 20 09:03 wcpsvc.log

root@vcsa-04 [ /var/log/vmware/wcp ]# tail -f wcpsvc.log
.
.
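If the raw tail output is too noisy, the same idea can be sketched in Python: follow the file and print only the lines that contain a keyword. This is a minimal illustration, not a VMware tool. The function names are my own, the keyword is whatever you choose to search for, and the follower does not handle log rotation or partially written lines.

```python
import os
import time
from typing import Iterable, Iterator


def matching_lines(lines: Iterable[str], keyword: str) -> Iterator[str]:
    """Yield only the lines that contain keyword (case-insensitive)."""
    needle = keyword.lower()
    for line in lines:
        if needle in line.lower():
            yield line.rstrip("\n")


def follow(path: str, poll_seconds: float = 1.0) -> Iterator[str]:
    """Yield lines appended to path, roughly like `tail -f`.

    Simplification: log rotation is not handled, and a line that is still
    being written may be returned in two pieces.
    """
    with open(path, "r", errors="replace") as handle:
        handle.seek(0, os.SEEK_END)  # start at the current end of the file
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)
```

For example, `for line in matching_lines(follow("/var/log/vmware/wcp/wcpsvc.log"), "error"): print(line)` would print new lines containing "error" until interrupted with Ctrl-C. The keyword here is just a placeholder, since the exact wording of the log messages may differ.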

And if everything goes according to plan, you should see the deployment complete, both in the vSphere client and in SDDC Manager > Solutions > Workload Management:

In the vSphere client, if we navigate to Workload Management > Namespaces, we can see a message stating that Workload Management has been successfully deployed and that we can now get started by creating namespaces and other Kubernetes objects on the Supervisor cluster. These can be your typical Kubernetes objects such as Pods, Deployments and StatefulSets, or other things such as complete deployments of Tanzu Kubernetes Grid (TKG) Guest Kubernetes Clusters.

However, that is a job for another day. Hopefully this post has shown how much automation has been put into VMware Cloud Foundation 4.0 to configure, validate and deploy infrastructure that facilitates easy deployment of vSphere with Kubernetes.

To learn more about VMware Cloud Foundation 4, check out the complete VCF 4 Announcement here.

14 Replies to “Getting started with VCF 4.0 Part 3 – vSphere with Kubernetes in a Workload Domain”

Zibi says:

March 25, 2020 at 8:59 pm

Incredible – this really looks fast to deploy. I wonder though if it can be automated even further with scripting – not for speeding things up, but for auditing purposes.
Could you write a bit about the Management Network requirements?
It looks like you need a traditional routed VLAN for that. This network needs to have traffic allowed to DNS, NTP and vCenter. I wonder though what else it may need a connection to, such as a Harbor registry or a local Git.
Is this network used by K8s developers in order to connect to the cluster?
1. Cormac says:
  
  March 26, 2020 at 8:34 am
  
  This should all be covered in detail when the official documentation is released. The design is based off of a VVD (VMware Validated Design). I’ll see what I can do though as there seems to be a lot of interest in how NSX-T has been configured for this workload.
Zibi says:

March 26, 2020 at 5:42 pm

“A lot of interest” is a major understatement 🙂
I was briefly involved in an attempt to operationalize K8s networking in a production environment. It is like Mission Impossible trying to get it through the security and firewalling departments.
If you managed to get NSX-T to automagically connect everything the moment it is created, then this feat alone is worth a lot of money.
It would be great if you could show things like network policies in K8s interacting with NSX-T to get the desired effect, or how you made vSAN File Services objects consumable by pods, which as you say are using non-routable networks.
1. Cormac says:
  
  March 27, 2020 at 9:36 am
  
  I’ll see what I can do over the coming weeks and months. Thanks for the feedback Zibi.
Glenn says:

April 9, 2020 at 12:38 am

I really like the articles you have written with the step by step screenshots. What hardware do you use in your lab?
1. Cormac says:
  
  April 9, 2020 at 8:50 am
  
  DELL PowerEdge R630s
srmanivel says:

April 22, 2020 at 12:55 pm

Hi Cormac,

I have a question. Is NSX-T mandatory for deploying K8s in vSphere 7.0? Without NSX-T, we cannot deploy Kubernetes in vSphere 7.0, is that right?
1. Cormac says:
  
  April 22, 2020 at 1:12 pm
  
  Correct – NSX-T 3.0 is mandatory.
  1. srmanivel says:
    
    April 22, 2020 at 5:52 pm
    
    Thank you Cormac,
    Manivel
Subhankar Ghose says:

May 3, 2020 at 7:29 am

Hi Cormac,

I had an issue once workload management was deployed successfully. The LB IP of the control plane VMs doesn’t open the page (http-title: VMware – Download Kubernetes CLI Tools). However, the individual IPs of the control plane VMs work fine, and the LB IP pings fine. Any suggestion as to what the issue could be?
1. Cormac says:
  
  May 5, 2020 at 9:10 am
  
  I think it is https:// – can you try that?
  1. Subhankar Ghose says:
    
    May 5, 2020 at 11:47 am
    
I have tried both, but it doesn’t work. Another thing is I am not using VCF …
Penko Ivanov says:

June 17, 2020 at 12:27 pm

Hi Cormac,
Do we need a dedicated physical cluster in the VI workload domain for Kubernetes, or can we use one cluster for Kubernetes and all other workloads?
1. Cormac says:
  
  June 17, 2020 at 12:40 pm
  
  Hi Penko,
  
  It is even easier now – you can deploy vSphere with Kubernetes on the Management Domain of VCF 4.0 (Consolidated Architecture).
  
  More details here – https://cormachogan.com/2020/05/26/vsphere-with-kubernetes-on-vcf-4-0-consolidated-architecture/