Using a Kubernetes Operator to query vSphere Resources
As many regular readers will be aware, I’ve spent some time looking at how vSphere resources are consumed by Kubernetes objects when Kubernetes is deployed as a set of virtual machines on top of vSphere infrastructure. While much of this is visible in the vSphere client, my focus is on how to see this vSphere resource consumption from within Kubernetes. If I am working in Kubernetes, I’d rather not context-switch out to the vSphere client just to see how much storage is left on a datastore, or how much CPU and memory is left on an ESXi host.
Some time back, I started work on vTopology, which allows me to plug a Shell/PowerShell script into a mechanism called krew and run it from kubectl to get some information. However, it is a little cumbersome to get all the pieces in place. So I began looking at alternative ways to achieve the same thing without requiring any external dependencies. Kubernetes Custom Resource Definitions (CRDs) and Operators seem to be universally recognized as the de facto way to extend Kubernetes. Thus, I started to look at how I might create a CRD and operator for vSphere objects such as HostInfo, DiskInfo, or VMInfo, and use these to query the underlying vSphere resources from kubectl.
As a proof-of-concept, I built a very simple CRD and Operator which returns the TotalCPU and FreeCPU of an ESXi host. I learnt so much from this exercise that I decided to write up the steps on GitHub and share them with you. If you are looking to learn more about Kubernetes CRDs and Operators, and are interested in how to get them to interact with govmomi, VMware’s Go library for the vSphere API, you might like to check it out. It is a long way off providing all of the detail of the underlying infrastructure which I currently have in vTopology, but maybe over time I’ll be able to add more features.
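To make the TotalCPU and FreeCPU idea concrete, here is a minimal, stdlib-only Go sketch of how such a status could be derived. The spec and status field names follow the hostname, totalCPU and freeCPU fields that appear in the operator's output, but `hostSummary`, its fields and `cpuStatus` are hypothetical stand-ins for whatever the vSphere API returns, and the calculation (per-core clock speed times core count, minus current usage, all in MHz) is an assumption rather than a copy of the code in the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HostInfoSpec and HostInfoStatus mirror the field names visible in the
// operator's YAML output (hostname, totalCPU, freeCPU).
type HostInfoSpec struct {
	Hostname string `json:"hostname"`
}

type HostInfoStatus struct {
	TotalCPU int64 `json:"totalCPU"`
	FreeCPU  int64 `json:"freeCPU"`
}

// hostSummary is a hypothetical stand-in for the host summary data that
// vSphere reports; the field names are assumptions made for this sketch.
type hostSummary struct {
	CPUMhz          int32 // clock speed of a single core, in MHz
	NumCPUCores     int16 // number of physical cores
	OverallCPUUsage int32 // aggregate CPU usage across the host, in MHz
}

// cpuStatus derives total and free CPU capacity in MHz from a summary.
func cpuStatus(s hostSummary) HostInfoStatus {
	total := int64(s.CPUMhz) * int64(s.NumCPUCores)
	return HostInfoStatus{
		TotalCPU: total,
		FreeCPU:  total - int64(s.OverallCPUUsage),
	}
}

func main() {
	// Illustrative inputs, chosen so the totals come to 43980 and 41238.
	st := cpuStatus(hostSummary{CPUMhz: 2199, NumCPUCores: 20, OverallCPUUsage: 2742})
	out, err := json.Marshal(st)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In a real operator, the reconciler would fill the summary from a vSphere query for the host named in the spec and then write the result back to the object's status.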
The complete code for the operator and CRD can be found on my GitHub repository: https://github.com/cormachogan/hostinfo-operator, along with step-by-step instructions on how to deploy it on your own Kubernetes cluster. I hope you find it useful. This was updated [18th Jan 2021] to move the vSphere login out of the Reconciler code and into main.go, to avoid logging in to vSphere on every reconcile. Here is some sample output which contains CPU usage information in the status:
$ kubectl get hi -o yaml
apiVersion: v1
items:
- apiVersion: topology.corinternal.com/v1
  kind: HostInfo
  metadata:
    creationTimestamp: "2021-01-18T14:15:07Z"
    generation: 1
    managedFields:
    - apiVersion: topology.corinternal.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:hostname: {}
      manager: kubectl
      operation: Update
      time: "2021-01-18T14:15:07Z"
    - apiVersion: topology.corinternal.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:freeCPU: {}
          f:totalCPU: {}
      manager: manager
      operation: Update
      time: "2021-01-18T14:31:00Z"
    name: hostinfo-host-e
    namespace: default
    resourceVersion: "28883011"
    selfLink: /apis/topology.corinternal.com/v1/namespaces/default/hostinfoes/hostinfo-host-e
    uid: 720a91bb-8929-4120-8ba9-d652c884f9ed
  spec:
    hostname: esxi-dell-e.rainpole.com
  status:
    freeCPU: 41238
    totalCPU: 43980
kind: List
metadata:
  resourceVersion: ""
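The idea behind the 18th Jan 2021 update, logging in once at startup and handing the session to the reconciler, can be sketched roughly as follows. This is a stdlib-only illustration, not the code from the repository: `vsphereClient`, `login`, the `VC` field, the URL and the credentials are all hypothetical, and the real reconciler has a different signature because it is built on controller-runtime.

```go
package main

import (
	"errors"
	"fmt"
)

// vsphereClient is a hypothetical stand-in for an authenticated vSphere session.
type vsphereClient struct {
	URL string
}

// login simulates an expensive vSphere login. The counter records how many
// times a login actually happens, so the effect of the pattern is visible.
func login(url, user, password string, counter *int) (*vsphereClient, error) {
	if url == "" {
		return nil, errors.New("vSphere URL is required")
	}
	*counter++
	return &vsphereClient{URL: url}, nil
}

// HostInfoReconciler carries the shared client, injected once at startup.
type HostInfoReconciler struct {
	VC *vsphereClient
}

// Reconcile reuses the injected client instead of logging in again.
func (r *HostInfoReconciler) Reconcile(hostname string) error {
	if r.VC == nil {
		return errors.New("no vSphere client configured")
	}
	// A real reconciler would query vSphere for hostname here, using r.VC.
	return nil
}

func main() {
	logins := 0

	// Log in once, in main, before any reconcile runs.
	vc, err := login("https://vcenter.example.com/sdk", "user", "secret", &logins)
	if err != nil {
		panic(err)
	}
	r := &HostInfoReconciler{VC: vc}

	// Reconcile several times; none of these calls logs in again.
	for i := 0; i < 3; i++ {
		if err := r.Reconcile("esxi-dell-e.rainpole.com"); err != nil {
			panic(err)
		}
	}
	fmt.Printf("reconciles=3 logins=%d\n", logins)
}
```

The trade-off is that a long-lived session can expire, so a production operator would probably need some way to detect that and log in again.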
I also created an operator to retrieve virtual machine information. You can also find it on GitHub, here: https://github.com/cormachogan/vminfo-operator. Again, you can see the sort of VM information that we can pull via the operator in the status fields.
$ kubectl get vminfo -o yaml
apiVersion: v1
items:
- apiVersion: topology.corinternal.com/v1
  kind: VMInfo
  metadata:
    creationTimestamp: "2021-01-18T12:20:45Z"
    generation: 1
    managedFields:
    - apiVersion: topology.corinternal.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:nodename: {}
      manager: kubectl
      operation: Update
      time: "2021-01-18T12:20:45Z"
    - apiVersion: topology.corinternal.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:guestId: {}
          f:hwVersion: {}
          f:ipAddress: {}
          f:pathToVM: {}
          f:powerState: {}
          f:resvdCPU: {}
          f:resvdMem: {}
          f:totalCPU: {}
          f:totalMem: {}
      manager: manager
      operation: Update
      time: "2021-01-18T12:20:46Z"
    name: tkg-worker-1
    namespace: default
    resourceVersion: "28841720"
    selfLink: /apis/topology.corinternal.com/v1/namespaces/default/vminfoes/tkg-worker-1
    uid: 2c60b273-a866-4344-baf5-0b3b924b65a5
  spec:
    nodename: tkg-cluster-1-18-5b-workers-kc5xn-dd68c4685-5v298
  status:
    guestId: vmwarePhoton64Guest
    hwVersion: vmx-17
    ipAddress: 10.27.62.45
    pathToVM: '[vsanDatastore] 4d56b55f-11db-8822-6463-246e962f4914/tkg-cluster-1-18-5b-workers-kc5xn-dd68c4685-5v298.vmx'
    powerState: poweredOn
    resvdCPU: 0
    resvdMem: 0
    totalCPU: 2
    totalMem: 4096
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
My final exercise was to create a tutorial on how to get FCD information. FCDs, short for First Class Disks, are used to back Kubernetes Persistent Volumes when these are deployed on vSphere Storage using the vSphere CSI driver. The operator is here: https://github.com/cormachogan/fcdinfo-operator. Here is some of the information we can get for the PV / FCD, such as the path to the file, and the provisioning type (thick/thin):
$ kubectl get fcd -o yaml
apiVersion: v1
items:
- apiVersion: topology.corinternal.com/v1
  kind: FCDInfo
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"topology.corinternal.com/v1","kind":"FCDInfo","metadata":{"annotations":{},"name":"fcdinfo-sample","namespace":"default"},"spec":{"pvId":"pvc-e3f6dd59-cbc0-49a7-97c8-d92a26732c43"}}
    creationTimestamp: "2021-01-26T10:43:21Z"
    generation: 1
    managedFields:
    - apiVersion: topology.corinternal.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:spec:
          .: {}
          f:pvId: {}
      manager: kubectl
      operation: Update
      time: "2021-01-26T10:43:21Z"
    - apiVersion: topology.corinternal.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:filePath: {}
          f:provisioningType: {}
          f:sizeMB: {}
      manager: manager
      operation: Update
      time: "2021-01-26T10:43:22Z"
    name: fcdinfo-sample
    namespace: default
    resourceVersion: "32818807"
    selfLink: /apis/topology.corinternal.com/v1/namespaces/default/fcdinfoes/fcdinfo-sample
    uid: 5d51788d-fc1b-441f-be11-723d02c87b4b
  spec:
    pvId: pvc-e3f6dd59-cbc0-49a7-97c8-d92a26732c43
  status:
    filePath: '[vsanDatastore] 038f6b5f-8122-d3af-eabe-246e962c240c/b39bcacc6ff143439f9cd6b7454999e4.vmdk'
    provisioningType: thin
    sizeMB: 1024
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
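The filePath returned for an FCD (like the pathToVM returned for a VM) uses the vSphere datastore path form, with the datastore name in square brackets followed by a relative path. If you want to consume these status fields from your own tooling, a small helper along these lines can split the two parts. This is a stdlib-only sketch, and `splitDatastorePath` is a hypothetical name rather than something taken from the operators.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDatastorePath splits a path of the form "[datastore] folder/file"
// into the datastore name and the path relative to that datastore.
// ok is false when the input does not start with a bracketed datastore name.
func splitDatastorePath(p string) (datastore, rel string, ok bool) {
	if !strings.HasPrefix(p, "[") {
		return "", "", false
	}
	end := strings.Index(p, "]")
	if end < 0 {
		return "", "", false
	}
	return p[1:end], strings.TrimSpace(p[end+1:]), true
}

func main() {
	// The filePath value from the FCDInfo status.
	path := "[vsanDatastore] 038f6b5f-8122-d3af-eabe-246e962c240c/b39bcacc6ff143439f9cd6b7454999e4.vmdk"

	ds, rel, ok := splitDatastorePath(path)
	if !ok {
		fmt.Println("not a datastore path")
		return
	}
	fmt.Println("datastore:", ds)
	fmt.Println("path:     ", rel)
}
```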
I learnt loads from building these operators. I hope you find the tutorials useful, from both an operator and a govmomi perspective.