vSphere Admin – Requirements Review
Let’s begin with a look at the requirements. It is the vSphere administrator’s responsibility to ensure everything is at the correct version. This feature is only available with vSphere 7.0 U2a, which was released on April 27th, 2021. After upgrading the vCenter server, a vSphere admin will also need to implement a rolling upgrade of the vSphere with Tanzu Supervisor cluster, taking it to version v1.19.1+vmware.2-vsc0.0.9-17882987. Note that NSX-T is not required to use this feature. In my setup, I am using vSphere Distributed Switches with the NSX ALB to provide load balancing services to my clusters and applications. Here are the versions that I used to create this post:
- vCenter 7.0 U2a, build 17920168
- ESXi – VMware ESXi, 7.0 U2, build 17630552
- VMware NSX Advanced Load Balancer version 2.0.1.4 (build 9087)
- vSphere with Tanzu Supervisor Cluster version v1.19.1+vmware.2-vsc0.0.9-17882987
New Services feature in Workload Management
After upgrading (or initially installing) the above versions, possibly the first thing a vSphere admin will notice is that there is a new Service tab available in the Workload Management section of the vSphere client UI.
If there are existing Namespaces, a vSphere admin will also notice a new section in the Namespace Summary called VM Service. Here is an example from one of my existing namespaces which already has a Tanzu Kubernetes cluster running.
vSphere Admin – VM Service Management
In the Workload Management view seen previously, there is a link to Manage the VM Service. This is where a vSphere admin can manage and monitor two items related to the VM Service, namely VM Classes and Content Libraries. A VM Class defines the resources that are allocated to a virtual machine provisioned by the VM service, and whether or not these should be reserved resources such as CPU and Memory, or whether the resources should be shared with other objects that require similar resources. A vSphere admin will need to give careful consideration as to how resources are allocated to VMs provisioned by developers via the VM Service, taking into account any potential conflicts that might arise due to excessive VM provisioning by developers overloading the system. More on this later.
A Content Library is used to store the images of the guest operating system that will be installed in the virtual machine by the VM Service. In this initial release of the VM Service, there is only a single guest OS supported – a Centos 8 distribution. VMware provides this image. It can be downloaded from the VMware Cloud Marketplace and added to your VM Service Content Library. Here is a direct link to the VM Service Image for Centos. I expect additional distributions will be added going forward.
I will not spend too much more time on VM Classes or Content Libraries as my colleague Myles has done an excellent write-up on these steps which can be found here. And if you are interested in creating some bespoke VM Classes, my other colleague Frank has done an awesome overview of the process on his blog. There is also quite a considerable amount of official documentation on the process available here. On my environment, this is what the Overview, VM Classes and Content Library views currently look like in VM Service > Manage.
In the Overview, the 16 VM Classes available for use are the built-in ones. However, as mentioned, a vSphere admin can create their own bespoke classes if required. The 3 VMs running VM Classes belong to the Tanzu Kubernetes cluster already deployed in this namespace, which is made up of one control plane node and two worker nodes. If you are not running a Tanzu Kubernetes cluster in your environment, then this field will show 0.
The VM Classes view shows more details about the classes, and how to create new ones. Each class has a Manage drop-down associated with it, which allows you to Edit or Delete the class. I recommend not editing the existing classes but rather creating your own new, well-named VM classes, as per Frank’s blog referenced earlier.
The Content Library view shows that there is no Content Library associated with any namespaces for the purposes of the VM Service (even though you may have a Content Library associated with a namespace for the purposes of deploying TKG clusters). You can also create a new Content Library from this view.
vSphere Admin – Create Content Library, Add Image, Associate with Namespace
At this point, a vSphere admin can go ahead and create a Content Library and add the VM Service image for Centos to it. This is the version that I downloaded and added to my Content Library for the purposes of this blog.
A vSphere admin can now decide whether to create a new namespace for a developer or, if the developer already has a namespace, to add the VM Service to that existing namespace. Note that when vSphere with Tanzu is initially deployed, the vSphere admin is responsible for determining which networks are made available to the different vSphere with Tanzu services, such as the TKG service. These are typically referred to as workload networks, and are where nodes in a Tanzu Kubernetes cluster are provisioned. When the vSphere admin creates a namespace, they are responsible for choosing which workload network is associated with the namespace. This workload network is also the network where the VM Service can provision virtual machines.
Through the VM Service window in the Namespace view Summary tab, a vSphere admin must add the VM Service Content Library and any required VM Classes to the namespace. Developers in this namespace will then have these VM Classes and VM Images available to them so they can create virtual machines. Here is an example of a Content Library and a single VM Class (best-effort-small) that have been added to my namespace cormac-ns.
An important step for the vSphere admin is to determine which storage should be used to back the virtual disks on any VM provisioned by the VM Service. At present, there are no storage policies created, so it is up to the vSphere admin to determine which class of storage should be available to the developers in this namespace. The policies map back to the underlying vSphere storage, which could be vSAN, vVols, VMFS or NFS. One requirement is that the storage is “shared” between all the ESXi hosts which host the vSphere with Tanzu control plane nodes.
Another role for the vSphere admin is to determine whether there should be limits placed around CPU usage, memory usage and disk usage on this namespace. Is there a risk that the developers in this namespace could become “noisy developers” and swamp the infrastructure by creating very many VMs via the VM Service? If so, then it is important to set Limits in the Capacity and Usage section in the namespace.
At this point, the vSphere admin’s initial setup and configuration work is complete, and they can now hand off to the developer, assuming the developer wishes to create some virtual machines as part of their overall application development effort. It should be appreciated that there is a considerable amount of deliberation required by the vSphere admin to create an environment that is customized to the needs of the developer, and this may involve numerous conversations between the vSphere admin, DevOps and the developers.
Developer – Query VM Service details
Possibly the first step a developer will take is to use the tools he/she is already familiar with to query the environment. A number of kubectl extensions are provided as part of the VM Service. Let’s look at those first before we build anything. We will assume that the developer has already used kubectl to login to the vSphere with Tanzu Supervisor Cluster and switch contexts to the namespace to which they have access. Let’s look at the nodes and context first.
chogan@chogan-a01 ~ % kubectl get nodes
NAME                               STATUS   ROLES    AGE   VERSION
42251e991e49bed98fd2e34b00d7b5ce   Ready    master   65m   v1.19.1+wcp.3
4225906fc5ac685e2be389d861826958   Ready    master   32m   v1.19.1+wcp.3
4225bd0fcbb647ebab9b659b6813d5da   Ready    master   47m   v1.19.1+wcp.3
chogan@chogan-a01 ~ % kubectl config get-contexts
CURRENT   NAME             CLUSTER          AUTHINFO                                         NAMESPACE
          10.202.112.152   10.202.112.152   wcp:10.202.112.152:administrator@vsphere.local
*         cormac-ns        10.202.112.152   wcp:10.202.112.152:administrator@vsphere.local   cormac-ns
Now that the developer is logged into the correct cluster, and the context is set to the correct namespace, they can begin to look at the VM service options.
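For readers who have not yet done the login, the step assumed above might look something like the following sketch. It relies on the kubectl vsphere plugin that is downloaded from the Supervisor Cluster. The function name supervisor_login is hypothetical, the argument values are placeholders, and --insecure-skip-tls-verify is only reasonable in a lab where the certificate is not trusted.

```shell
# Sketch only: log in to the Supervisor Cluster, then switch to a namespace context.
# supervisor_login is a hypothetical helper name, not part of any VMware tooling.
# Arguments: $1 = Supervisor Cluster address, $2 = vSphere user, $3 = namespace.
supervisor_login() {
  # Requires the kubectl vsphere plugin; prompts for the user's password.
  kubectl vsphere login --server="$1" --vsphere-username "$2" \
    --insecure-skip-tls-verify || return 1
  # The login creates a context for each namespace the user can access.
  kubectl config use-context "$3"
}
```

With the values from my environment, this would be called as supervisor_login 10.202.112.152 administrator@vsphere.local cormac-ns.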
Developer – Get Virtual Machine Classes
Since we allocated one VM class to this namespace, that is what is returned when we query it as follows.
chogan@chogan-a01 ~ % kubectl get vmclass
NAME                CPU   MEMORY   AGE
best-effort-small   2     4Gi      3m30s
Developer – Get Virtual Machine Images
With the VM Service Content Library associated with the namespace, the developer should be able to query the available images. Note that since this namespace also has a Tanzu Kubernetes cluster, the TKG service also has a content library associated, and the images from that content library are also listed. The CentOS image from the VM Service content library is listed at the top.
chogan@chogan-a01 ~ % kubectl get vmimage
NAME                                                         VERSION                           OSTYPE                FORMAT   AGE
centos-stream-8-vmservice-v1alpha1-1619529007339                                               centos8_64Guest       ovf      2m33s
ob-15957779-photon-3-k8s-v1.16.8---vmware.1-tkg.3.60d2ffd    v1.16.8+vmware.1-tkg.3.60d2ffd    vmwarePhoton64Guest   ovf      103m
ob-16466772-photon-3-k8s-v1.17.7---vmware.1-tkg.1.154236c    v1.17.7+vmware.1-tkg.1.154236c    vmwarePhoton64Guest   ovf      103m
ob-16545581-photon-3-k8s-v1.16.12---vmware.1-tkg.1.da7afe7   v1.16.12+vmware.1-tkg.1.da7afe7   vmwarePhoton64Guest   ovf      103m
ob-16551547-photon-3-k8s-v1.17.8---vmware.1-tkg.1.5417466    v1.17.8+vmware.1-tkg.1.5417466    vmwarePhoton64Guest   ovf      103m
ob-16897056-photon-3-k8s-v1.16.14---vmware.1-tkg.1.ada4837   v1.16.14+vmware.1-tkg.1.ada4837   vmwarePhoton64Guest   ovf      103m
ob-16924026-photon-3-k8s-v1.18.5---vmware.1-tkg.1.c40d30d    v1.18.5+vmware.1-tkg.1.c40d30d    vmwarePhoton64Guest   ovf      103m
ob-16924027-photon-3-k8s-v1.17.11---vmware.1-tkg.1.15f1e18   v1.17.11+vmware.1-tkg.1.15f1e18   vmwarePhoton64Guest   ovf      103m
ob-17010758-photon-3-k8s-v1.17.11---vmware.1-tkg.2.ad3d374   v1.17.11+vmware.1-tkg.2.ad3d374   vmwarePhoton64Guest   ovf      103m
ob-17332787-photon-3-k8s-v1.17.13---vmware.1-tkg.2.2c133ed   v1.17.13+vmware.1-tkg.2.2c133ed   vmwarePhoton64Guest   ovf      103m
ob-17419070-photon-3-k8s-v1.18.10---vmware.1-tkg.1.3a6cd48   v1.18.10+vmware.1-tkg.1.3a6cd48   vmwarePhoton64Guest   ovf      103m
ob-17654937-photon-3-k8s-v1.18.15---vmware.1-tkg.1.600e412   v1.18.15+vmware.1-tkg.1.600e412   vmwarePhoton64Guest   ovf      103m
ob-17658793-photon-3-k8s-v1.17.17---vmware.1-tkg.1.d44d45a   v1.17.17+vmware.1-tkg.1.d44d45a   vmwarePhoton64Guest   ovf      103m
ob-17660956-photon-3-k8s-v1.19.7---vmware.1-tkg.1.fc82c41    v1.19.7+vmware.1-tkg.1.fc82c41    vmwarePhoton64Guest   ovf      103m
ob-17861429-photon-3-k8s-v1.20.2---vmware.1-tkg.1.1d4f79a    v1.20.2+vmware.1-tkg.1.1d4f79a    vmwarePhoton64Guest   ovf      103m
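When a namespace also carries the TKG content library, the list gets long. One way a developer might trim it down to the VM Service images is sketched below. It assumes that every TKG node image reports the vmwarePhoton64Guest OS type, as in my listing, and simply filters those rows out. The function name vm_service_images is made up for this example.

```shell
# Sketch only: list VM images, hiding the TKG node images.
# Assumption: TKG node images always show the vmwarePhoton64Guest OS type,
# as they do in the listing above. vm_service_images is a hypothetical name.
vm_service_images() {
  # --no-headers drops the column header so only image rows remain.
  kubectl get vmimage --no-headers | grep -v 'vmwarePhoton64Guest'
}
```

In my namespace this would leave only the centos-stream-8-vmservice row. Note that grep exits non-zero when nothing is left, so an empty result also shows up as a failed command.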
Developer – Get Storage Classes / Storage Policies
We mentioned earlier that one of the steps that is the responsibility of the vSphere administrator is to assign storage policies to a namespace. Developers can then query these policies which are instantiated as storage classes in the namespace. In this instance, there are 2 policies available. One is the vSAN default policy, and the other is called r5, short for RAID-5, a space saving policy also available on vSAN. We will shortly see how the developer can use the r5 StorageClass as the policy to build the virtual disks for the VM.
chogan@chogan-a01 ~ % kubectl get storageclass
NAME                          PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
r5                            csi.vsphere.vmware.com   Delete          Immediate           true                   22h
vsan-default-storage-policy   csi.vsphere.vmware.com   Delete          Immediate           true                   42d
Developer – Get Virtual Machine Network(s)
One final piece of information that a developer will need is to see which networks are available to plumb into the virtual machine. We mentioned previously that the vSphere admin chooses a workload network for the namespace. The developer has a means to query the network via kubectl.
chogan@chogan-a01 ~ % kubectl get network
NAME                 AGE
workload-network-1   41d
Developer – cloud-init customization
While the deployment of virtual machines from a YAML manifest sounds great, it is not going to be much use if we cannot customize the image. To facilitate the customization of the VM images, cloud-init is used. This has become one of the most common mechanisms to customize Linux images, and there is a wealth of documentation available here. We are using the #cloud-config format, which is a simple way to do customization by providing what is known as user-data.
Again, I’m not going to do a detailed write-up on cloud-init or user-data as Myles has done a great job covering this already in his blog post. Instead, I will create a simple user-data file which contains some simple entries that grant the developer authority to ssh from his/her desktop to the VM as user centos. The developer is also going to enable DHCP on the VM. The assumption is that there is a DHCP server available on the workload network where this VM is going to be deployed. One final step is to run a command to update the /etc/motd file so that a message is displayed when the developer logs in to the VM.
#cloud-config
ssh_pwauth: yes
users:
  - default
  - name: centos
    ssh_authorized_keys:
      - ssh-rsa AAAAB3NzaC1...xxxxxxxxxx
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: sudo
    shell: /bin/bash
network:
  version: 2
  ethernets:
    ens192:
      dhcp4: true
runcmd:
  - "echo -e 'Centos VM built by VM Operator on '`date` >> /etc/motd"
The ssh_authorized_keys entry comes from the ~/.ssh/id_rsa.pub file on the developer’s own desktop. To use this user-data with cloud-init, it needs to be encoded in base64 format. I’ve shortened the actual output here, but normally it would cover many lines.
chogan@chogan-a01 ~% base64 < centos-user-data
I2Nsb3VkLWNvbmZpZwpzc2hfcHdhdXRoO...xxxxxxxxxx
This output should now be included in the VM YAML manifest which we will look at next.
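Since a single stray character in the encoded string will break the customization, it can be worth checking that the encoding decodes back to the original file before pasting it anywhere. The following is a minimal sketch of that check. It uses a tiny stand-in file rather than the real centos-user-data, the helper name encode_user_data is hypothetical, and the -d decode flag is the GNU coreutils spelling (some older macOS releases use -D instead).

```shell
# Sketch only: encode a user-data file as a single base64 line and confirm
# that it decodes back to the original. encode_user_data is a hypothetical name.
encode_user_data() {
  # tr strips the line wrapping that some base64 implementations add.
  base64 < "$1" | tr -d '\n'
}

# Small stand-in for the real centos-user-data file.
tmp=$(mktemp)
printf '#cloud-config\nssh_pwauth: yes\n' > "$tmp"

encoded=$(encode_user_data "$tmp")
# -d is the GNU coreutils decode flag; adjust for your platform if needed.
decoded=$(printf '%s' "$encoded" | base64 -d)

if [ "$decoded" = "$(cat "$tmp")" ]; then
  echo "user-data round-trips cleanly"
else
  echo "user-data does not round-trip" >&2
fi
rm -f "$tmp"
```

Stripping the line wraps gives one long string, which is convenient when it has to sit under a single key in a manifest.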
Developer – VM Manifest
This is an example of a manifest which a developer might create to deploy a virtual machine via kubectl. It is written in YAML, yet another markup language. It has two objects that it creates, a VirtualMachine and a ConfigMap. The ConfigMap holds the base64 user-data for cloud-init customization that we created in the previous step. Let’s look at some of the other interesting fields in this manifest.
apiVersion: vmoperator.vmware.com/v1alpha1
kind: VirtualMachine
metadata:
  name: centos-vm
  namespace: cormac-ns
spec:
  networkInterfaces:
  - networkName: "workload-network-1"
    networkType: vsphere-distributed
  className: best-effort-small
  imageName: centos-stream-8-vmservice-v1alpha1-1619529007339
  powerState: poweredOn
  storageClass: r5
  vmMetadata:
    configMapName: centos-vm-cfm
    transport: OvfEnv
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: centos-vm-cfm
  namespace: cormac-ns
data:
  user-data: |
    I2Nsb3VkLWNvbmZpZwpzc2hfcHdhdXRoO...xxxxxxxxxx+IC9ldGMvbW90ZCIK
  hostname: centos-vm
Let’s begin with the namespace. This virtual machine is being provisioned in the cormac-ns namespace. Most likely, the developer will only have access to a few namespaces, or maybe only one. They will need to set this appropriately. In the spec section, the networking has entries for the name of the network, workload-network-1, and the type of network, vsphere-distributed. There is no NSX-T in this deployment. As mentioned in the introduction, this is a vSphere Distributed Switch environment which uses the NSX ALB for a load balancer.
We already saw how the only VM Class added to this namespace was best-effort-small and that the only image in the content library was the Centos 8 image. Both of these items are added to the manifest. The storage class refers to the storage policy that was added to the namespace by the vSphere admin. In this example, the developer has decided to use r5, a RAID-5 policy for vSAN. The allocation of storage policies to a namespace to form storage classes would have been the responsibility of the vSphere admin as highlighted earlier. The user-data is the base64 encoded data generated in the previous step. The final interesting piece is the hostname in the ConfigMap. Everything is in place for the developer to deploy the VM.
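Because the manifest refers to four objects by name (the VM class, the image, the storage class and the network), a typo in any of them only surfaces after the apply. A developer might run a quick pre-flight check first, along the lines of this sketch. The function name preflight is hypothetical; it only uses the kubectl get queries shown earlier in this post.

```shell
# Sketch only: confirm that the objects named in the manifest exist before
# applying it. preflight is a hypothetical helper name.
# Arguments, in order: VM class, VM image, storage class, network.
preflight() {
  for pair in "vmclass/$1" "vmimage/$2" "storageclass/$3" "network/$4"; do
    kind=${pair%%/*}   # text before the first slash
    name=${pair#*/}    # text after the first slash
    if ! kubectl get "$kind" "$name" > /dev/null 2>&1; then
      echo "missing $kind: $name" >&2
      return 1
    fi
  done
  echo "all referenced objects found"
}
```

For the manifest above, the call would be preflight best-effort-small centos-stream-8-vmservice-v1alpha1-1619529007339 r5 workload-network-1.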
Developer – Deploy and Test
Let’s deploy the manifest with the VM and the ConfigMap first.
chogan@chogan-a01 ~% kubectl apply -f centos-vm.yaml
virtualmachine.vmoperator.vmware.com/centos-vm created
configmap/centos-vm-cfm created
The developer should shortly be able to query virtual machines from kubectl. Note that since I already have a Tanzu Kubernetes cluster deployed, these VMs will also be listed.
chogan@chogan-a01 ~% kubectl get vm
NAME                                                POWERSTATE   AGE
centos-vm                                           poweredOn    4m42s
tkg-cluster-1-19-7-control-plane-psktw              poweredOn    42d
tkg-cluster-1-19-7-workers-x28f7-746966b6f7-jjcrn   poweredOn    42d
tkg-cluster-1-19-7-workers-x28f7-746966b6f7-zblpw   poweredOn    42d
The next step would be to ssh to the newly provisioned VM, assuming it was successfully allocated an IP address by the DHCP server on the workload network. The IP address of the VM can be discovered in a few different ways. Here is one such way.
chogan@chogan-a01 ~% kubectl get vm centos-vm -o jsonpath='{.status.vmIp}'
10.202.112.174%
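The address is not available the instant the VM is created, so in a script the developer may want to poll for it. The sketch below wraps the same jsonpath query in a simple retry loop. The function name wait_for_vm_ip is hypothetical, and the retry count and delay are arbitrary defaults rather than recommended values.

```shell
# Sketch only: poll until the VM reports an IP address or the attempts run out.
# wait_for_vm_ip is a hypothetical helper name.
# Arguments: $1 = VM name, $2 = attempts (default 30), $3 = delay in seconds (default 10).
wait_for_vm_ip() {
  vm="$1"; attempts="${2:-30}"; delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Same query as above; the field is empty until an address is assigned.
    ip=$(kubectl get vm "$vm" -o jsonpath='{.status.vmIp}' 2>/dev/null)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no IP reported for $vm" >&2
  return 1
}
```

A call such as wait_for_vm_ip centos-vm prints the address followed by a newline, which also avoids the trailing % that zsh displays when output does not end with one.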
The developer should be able to ssh to that IP address without having to provide a password, since we provided the developer’s public key as part of the user-data. Other things we specified were that there should be a login message displayed from an updated /etc/motd, that this user should be able to sudo, and that the hostname should be set to centos-vm. Let’s check if these worked.
chogan@chogan-a01 ~% ssh centos@10.202.112.169
The authenticity of host '10.202.112.169 (10.202.112.169)' can't be established.
ECDSA key fingerprint is SHA256:80Mx2li6CNDouvD0vpuSmrEbSdjOgJz1QleE71oddb0.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.202.112.169' (ECDSA) to the list of known hosts.
Centos VM built by VM Operator on Wed May 5 06:14:04 EDT 2021
[centos@centos-vm ~]$ sudo su -
Last login: Tue Feb 23 16:03:20 EST 2021 on tty1
[root@centos-vm ~]# hostname
centos-vm
[root@centos-vm ~]#
Everything appears to be working as expected. The developer should be pretty happy. I think you’ll agree that this is a pretty cool feature of vSphere with Tanzu.
vSphere Admin – Manage and Monitoring
Let’s close by returning to the role of the vSphere administrator. The vSphere admin will obviously be responsible for managing and monitoring the infrastructure, including resource usage, and will want to know how the developers are using (or misusing) it. The first point to highlight is that the vSphere admin has full visibility into the namespace of the developer and can see VM’s resources via the vSphere client UI, as shown here:
Note that this is a Developer Managed VM, and most of the actions associated with the VM, such as power on/off, migrate, etc., are prohibited in the vSphere client. The lifecycle of the VM is managed by the developer.
There is also visibility from the Workload Management perspective, where the vSphere administrator can see VM Classes, Content Libraries and VMs as well. These fields have now been updated to reflect how developers are using this feature in the cluster.
Hopefully this post has now given you a good insight into how vSphere with Tanzu is evolving, and how the VM Service can extend the ways in which developers can create applications that are made up of both VMs and Container workloads.