vtopology – Insights into vSphere infrastructure from kubectl
As I became more and more familiar with running Kubernetes on top of vSphere, I came to the realization that it would be useful to be able to query the vSphere infrastructure from Kubernetes, particularly via kubectl. For example, I might like to know some details about the master nodes and worker nodes (e.g. which ESXi host are they on? How much resource are they consuming?). Similarly, if I have a persistent volume, how can I find out which vSphere datastore it is on, which policy it is using, and what the path to the VMDK is? So I started work on a tool (vtopology) that would let me retrieve this information without having to log in to the vSphere Client every time I wanted to look at it.
vtopology is a combination of bash and Powershell/PowerCLI for displaying vSphere topology from kubectl. The idea is that you should be able to map Kubernetes objects (e.g. nodes, PVs) to vSphere objects (e.g. virtual machines, VMDKs). Once installed, users can run vtopology and display underlying vSphere infrastructure components to see how their Kubernetes cluster is consuming vSphere resources.
Both PowerShell and PowerCLI are required. Deployment instructions for PowerShell and PowerCLI on Ubuntu can be found here. The instructions are written for Ubuntu 16.04, so simply modify them slightly to point to the correct repository for your OS version. I have successfully used the same steps on Ubuntu 17.04, which is the release this tool has been tested and validated on.
While vtopology can be used as a simple PowerShell script, the tool has also been packaged so that it can be run as a krew plugin. This means that users can run the tool as a ‘kubectl vtopology’ command. You can find more information on how to install krew here. Once krew is installed, you can install the vtopology plugin as follows:
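For anyone curious how a custom plugin like this gets wired into krew: the --manifest flag used below points at a krew plugin manifest. The following is not the actual vtopology manifest, just an illustrative sketch of krew's documented manifest format (all field values here are assumptions):

```yaml
apiVersion: krew.googlecontainertools.github.com/v1alpha2
kind: Plugin
metadata:
  name: vtopology
spec:
  version: "v1.0.0"
  shortDescription: Insights into vSphere infrastructure from kubectl
  caveats: |
    This plugin needs the following programs:
    * PowerShell and PowerCLI
  platforms:
  - selector:
      matchLabels:
        os: linux
    files:
    - from: "*"
      to: "."
    bin: vtopology.sh
```

The CAVEATS section printed during installation comes straight from the caveats field of the manifest.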
$ kubectl krew install --manifest=vtopology.yaml --archive=vtopology.tar.gz
Installing plugin: vtopology
CAVEATS:
\
 |  This plugin needs the following programs:
 |  * PowerShell and PowerCLI
/
Installed plugin: vtopology
Once installed, the tool can be run directly from kubectl:
$ kubectl vtopology -h
Usage: kubectl vtopology &lt;connect-args&gt; &lt;args&gt;

where connect-args (optionally) includes the following:
 -vc | --vcenter
  -u | --username
  -p | --password

and where args is one of the following:
  -e | --hosts
  -v | --vms
  -n | --networks
  -d | --datastores
  -k | --k8svms
  -s | --spbm
  -t | --tags
  -a | --all
  -h | --help

Advanced args
 -pv &lt;pv_id&gt;     - display vSphere storage details about a Persistent Volume
 -kn &lt;node_name&gt; - display vSphere VM details about a Kubernetes node
 -sp &lt;policy&gt;    - display details of storage policy

Note this tool requires PowerShell with PowerCLI, as well as kubectl
Here are some examples of where vtopology could be useful.
I want to scale out my K8s cluster. What vSphere resources are available?
I already have a Kubernetes cluster deployed, made up of a number of master nodes and worker nodes. I wish to scale out the K8s cluster to have additional worker nodes. Can I check the usage of the current vSphere environment and see if there are available resources?
$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -e

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01
*** To switch to another context, use the kubectl config use-context command ***

Found DataCenter: CH-Datacenter

  Found Cluster: CH-Cluster

    Found ESXi HOST: esxi-dell-g.rainpole.com

        vSphere Version   : 6.7.0
        Build Number      : 14320388
        Connection State  : Connected
        Power State       : PoweredOn
        Manufacturer      : Dell Inc.
        Model             : PowerEdge R630
        Number of CPU     : 20
        Total CPU (MHz)   : 43980
        CPU Used (MHz)    : 4236
        Total Memory (GB) : 127.91
        Memory Used (GB)  : 78.55

        ESXi HOST esxi-dell-g.rainpole.com is part of Host Group PKS-AZ-1

    Found ESXi HOST: esxi-dell-h.rainpole.com

        vSphere Version   : 6.7.0
        Build Number      : 14320388
        Connection State  : Connected
        Power State       : PoweredOn
        Manufacturer      : Dell Inc.
        Model             : PowerEdge R630
        Number of CPU     : 20
        Total CPU (MHz)   : 43980
        CPU Used (MHz)    : 4439
        Total Memory (GB) : 127.91
        Memory Used (GB)  : 100.28

        ESXi HOST esxi-dell-h.rainpole.com is part of Host Group PKS-AZ-1

    Found ESXi HOST: esxi-dell-f.rainpole.com

        vSphere Version   : 6.7.0
        Build Number      : 14320388
        Connection State  : Connected
        Power State       : PoweredOn
        Manufacturer      : Dell Inc.
        Model             : PowerEdge R630
        Number of CPU     : 20
        Total CPU (MHz)   : 43980
        CPU Used (MHz)    : 2003
        Total Memory (GB) : 127.91
        Memory Used (GB)  : 85.69

        ESXi HOST esxi-dell-f.rainpole.com is part of Host Group PKS-AZ-1

    Found ESXi HOST: esxi-dell-e.rainpole.com

        vSphere Version   : 6.7.0
        Build Number      : 14320388
        Connection State  : Connected
        Power State       : PoweredOn
        Manufacturer      : Dell Inc.
        Model             : PowerEdge R630
        Number of CPU     : 20
        Total CPU (MHz)   : 43980
        CPU Used (MHz)    : 1674
        Total Memory (GB) : 127.91
        Memory Used (GB)  : 113.75

        ESXi HOST esxi-dell-e.rainpole.com is part of Host Group PKS-AZ-2
As you can see, I also report whether or not the host is part of a Host Group. This could be useful in Enterprise PKS where Host Groups can now be integrated with Availability Zones to determine placement of Kubernetes components. You may need to add additional ESXi hosts to an Availability Zone to extend the number of worker nodes, whilst ensuring a failure (e.g. rack power outage) does not impact an application. More on Enterprise PKS and Host Groups can be found here.
I want to deploy another app. What worker resources are available?
There are some kubectl commands to query the nodes, but they do not give a lot of detail from a resource perspective.
$ kubectl get nodes
NAME          STATUS   ROLES    AGE    VERSION
k8s-master    Ready    master   136d   v1.14.2
k8s-worker1   Ready    &lt;none&gt;   136d   v1.14.2
k8s-worker2   Ready    &lt;none&gt;   136d   v1.14.2
k8s-worker3   Ready    &lt;none&gt;   14d    v1.14.2
k8s-worker4   Ready    &lt;none&gt;   14d    v1.14.2
k8s-worker5   Ready    &lt;none&gt;   14d    v1.14.2
There is an option to get extended information:
$ kubectl get nodes -o wide
NAME          STATUS   ROLES    AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k8s-master    Ready    master   136d   v1.14.2   10.27.51.39   10.27.51.39   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
k8s-worker1   Ready    &lt;none&gt;   136d   v1.14.2   10.27.51.40   10.27.51.40   Ubuntu 18.04.3 LTS   4.15.0-72-generic   docker://18.6.0
k8s-worker2   Ready    &lt;none&gt;   136d   v1.14.2   10.27.51.41   10.27.51.41   Ubuntu 18.04.3 LTS   4.15.0-72-generic   docker://18.6.0
k8s-worker3   Ready    &lt;none&gt;   14d    v1.14.2   10.27.51.31   10.27.51.31   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
k8s-worker4   Ready    &lt;none&gt;   14d    v1.14.2   10.27.51.32   10.27.51.32   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
k8s-worker5   Ready    &lt;none&gt;   14d    v1.14.2   10.27.51.30   10.27.51.30   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
We could use vtopology to get more information about the nodes using the -k option.
$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -k

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01
*** To switch to another context, use the kubectl config use-context command ***

Kubernetes Node VM Name : k8s-master

        IP Address             : 10.27.51.39
        Power State            : PoweredOn
        On ESXi host           : esxi-dell-h.rainpole.com
        Folder                 : ubuntu64Guest
        Hardware Version       : vmx-10
        Number of CPU          : 4
        Cores per Socket       : 1
        Memory (GB)            : 4
        Provisioned Space (GB) : 114.55
        Used Space (GB)        : 114.55

Kubernetes Node VM Name : k8s-worker1

        IP Address             : 10.27.51.40
        Power State            : PoweredOn
        On ESXi host           : esxi-dell-e.rainpole.com
        Folder                 : ubuntu64Guest
        Hardware Version       : vmx-15
        Number of CPU          : 4
        Cores per Socket       : 1
        Memory (GB)            : 4
        Provisioned Space (GB) : 75.62
        Used Space (GB)        : 75.62

Kubernetes Node VM Name : k8s-worker2

        IP Address             : 10.27.51.41
        Power State            : PoweredOn
        On ESXi host           : esxi-dell-h.rainpole.com
        Folder                 : ubuntu64Guest
        Hardware Version       : vmx-15
        Number of CPU          : 4
        Cores per Socket       : 1
        Memory (GB)            : 4
        Provisioned Space (GB) : 88.78
        Used Space (GB)        : 88.78

etc, etc.
If the nodes are part of a VM/Host Group, this would also be reported in the output above.
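Under the covers, a node-to-VM mapping like this essentially keys off each node's internal IP, which can be matched against the guest IP that vCenter reports for each VM. As a minimal sketch of the kubectl side (parsing a captured listing here rather than a live cluster, so the sample data is taken from the output above):

```shell
# Sample 'kubectl get nodes -o wide' output, as captured above; on a live
# cluster you would pipe the real command in instead of this variable.
nodes_wide='NAME          STATUS   ROLES    AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k8s-master    Ready    master   136d   v1.14.2   10.27.51.39   10.27.51.39   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
k8s-worker1   Ready    <none>   136d   v1.14.2   10.27.51.40   10.27.51.40   Ubuntu 18.04.3 LTS   4.15.0-72-generic   docker://18.6.0'

# Skip the header row and print "<node> <internal-ip>" pairs -
# these IPs are then looked up against VM guest IPs on the vSphere side.
node_ips=$(echo "$nodes_wide" | awk 'NR > 1 { print $1, $6 }')
echo "$node_ips"
```

This is only an illustration of the matching key, not vtopology's actual implementation.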
My app requires Persistent Storage. What datastores are available?
Many applications require Persistent Storage. But which datastores are available for provisioning of volumes? And how much space is available on them? vtopology can help here too, using the -d option.
$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -d

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01
*** To switch to another context, use the kubectl config use-context command ***

Found Datastore: 5TB-VMFS6

        State           : Available
        Datastore Type  : VMFS
        Capacity (GB)   : 4,998.75
        Free Space (GB) : 3,514.90

        Connected hosts :
                esxi-dell-g.rainpole.com
                esxi-dell-h.rainpole.com
                esxi-dell-f.rainpole.com
                esxi-dell-e.rainpole.com

Found Datastore: pksvol

        State           : Available
        Datastore Type  : VMFS
        Capacity (GB)   : 999.75
        Free Space (GB) : 519.66

        Connected hosts :
                esxi-dell-g.rainpole.com
                esxi-dell-h.rainpole.com
                esxi-dell-f.rainpole.com
                esxi-dell-e.rainpole.com

Found Datastore: pure-iscsi-vol1

        State           : Unavailable
        Datastore Type  : VMFS
        Capacity (GB)   : 99.75
        Free Space (GB) : 48.34

        Connected hosts :
                esxi-dell-h.rainpole.com

Found Datastore: vsanDatastore

        State           : Available
        Datastore Type  : vsan
        Capacity (GB)   : 5,961.63
        Free Space (GB) : 3,574.25

        Connected hosts :
                esxi-dell-g.rainpole.com
                esxi-dell-h.rainpole.com
                esxi-dell-f.rainpole.com
                esxi-dell-e.rainpole.com

Found Datastore: vVolDatastore

        State           : Available
        Datastore Type  : VVOL
        Capacity (GB)   : 8,388,608.00
        Free Space (GB) : 8,388,538.38

        Connected hosts :
                esxi-dell-g.rainpole.com
                esxi-dell-h.rainpole.com
                esxi-dell-f.rainpole.com
                esxi-dell-e.rainpole.com

etc, etc.
The list of connected hosts is useful as it tells you which ESXi hosts have access to the datastore, i.e. whether or not it is shared.
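If you wanted to script against this output, e.g. to shortlist datastores by free space before provisioning new volumes, a small sketch might look like the following (the field layout is assumed from the sample output above, and the data here is a captured fragment of it rather than a live run):

```shell
# Captured fragment of 'kubectl vtopology ... -d' output (format assumed
# from the example above).
vtop_d='Found Datastore: 5TB-VMFS6
Free Space (GB) : 3,514.90
Found Datastore: pksvol
Free Space (GB) : 519.66'

# Remember the most recent datastore name, then print it alongside the
# free-space figure when that line arrives.
free_space=$(echo "$vtop_d" | awk '
  /^Found Datastore:/ { ds = $3 }
  /^Free Space/       { print ds, $NF }')
echo "$free_space"
```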
Let’s stay on the topic of storage. Let’s say that as a K8s cluster admin, I want to create some StorageClasses. When K8s is running on vSphere, StorageClasses can consume Storage Policies built on vSphere. However, how do I know what Storage Policies are available? How can I get the details around the Storage Policies?
How can I get details about vSphere Storage Policies?
Let’s start by getting information about the various Storage Policies that can then be incorporated into a StorageClass. We can use the -s option for this.
$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -s

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01
*** To switch to another context, use the kubectl config use-context command ***

*** These are Storage Policies in use on the vSphere Infrastructure which could potentially be used for Kubernetes StorageClasses

Found Policy: VVol No Requirements Policy
Found Policy: RAID-5
Found Policy: Space-Efficient1
Found Policy: VM Encryption Policy
Found Policy: vvol-simple
Found Policy: vvol-snaps
Found Policy: raid-1
Found Policy: silver
Found Policy: OSR-0
Found Policy: gold
Found Policy: OSR-100
Found Policy: Host-local PMem Default Storage Policy
Found Policy: Space-Efficient
Found Policy: vSAN Default Storage Policy
Found Policy: bronze
Found Policy: VMcrypt
Now that we have the list of policies, we can query each of them using the advanced option -sp.
$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -sp raid-1

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01
*** To switch to another context, use the kubectl config use-context command ***

Display Detailed Policy attributes of: raid-1

Found Policy Attribute : (VSAN.hostFailuresToTolerate=1) AND
                         (VSAN.replicaPreference=RAID-1 (Mirroring) - Performance) AND
                         (VSAN.checksumDisabled=False) AND (VSAN.stripeWidth=1) AND
                         (VSAN.forceProvisioning=False) AND (VSAN.iopsLimit=0) AND
                         (VSAN.cacheReservation=0) AND (VSAN.proportionalCapacity=0)
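A policy like raid-1 can then be consumed from a Kubernetes StorageClass. For the vSphere CSI driver this is done via the storagepolicyname parameter; a minimal sketch (the StorageClass name here is taken from the examples later in this post, but treat the exact manifest as illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cass-sc-csi
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "raid-1"
```

Any PV dynamically provisioned through this StorageClass should then show up as compliant with the raid-1 policy on the vSphere side.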
Here is another example, but this time querying a vVols datastore. Please note that at the time of writing, vVols is not supported with the vSphere CSI driver. However, it is something we hope to have supported very soon.
$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -sp vvol-snaps

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01
*** To switch to another context, use the kubectl config use-context command ***

Display Detailed Policy attributes of: vvol-snaps

Found Policy Attribute : (com.purestorage.storage.policy.PureFlashArray=True) AND
                         (com.purestorage.storage.replication.LocalSnapshotPolicyCapable=True) AND
                         (com.purestorage.storage.replication.LocalSnapshotInterval=01:00:00) AND
                         (com.purestorage.storage.replication.LocalSnapshotRetention=7.00:00:00)
Now let's assume that we need to troubleshoot a persistent volume (PV) issue. How can we map a PV to an actual VMDK on vSphere for further investigation? Which datastore is it on? What policy was used to create it? Let's see how vtopology can help here.
How can I determine the VMDK that backs a PV?
We can already get some information from the PV and PVC outputs, and we can even describe them for further detail. However, none of these outputs tell us anything about the VMDK used to back the PV in vSphere.
$ kubectl get pvc -n cassandra
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cassandra-data-cassandra-0   Bound    pvc-e2989b44-2e27-11ea-80e4-005056a239d9   1Gi        RWO            cass-sc-csi    3m28s
cassandra-data-cassandra-1   Bound    pvc-01936124-2e28-11ea-80e4-005056a239d9   1Gi        RWO            cass-sc-csi    2m36s
cassandra-data-cassandra-2   Bound    pvc-3bdd199d-2e28-11ea-80e4-005056a239d9   1Gi        RWO            cass-sc-csi    58s

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                  STORAGECLASS   REASON   AGE
pvc-01936124-2e28-11ea-80e4-005056a239d9   1Gi        RWO            Delete           Bound    cassandra/cassandra-data-cassandra-1   cass-sc-csi             2m28s
pvc-3bdd199d-2e28-11ea-80e4-005056a239d9   1Gi        RWO            Delete           Bound    cassandra/cassandra-data-cassandra-2   cass-sc-csi             56s
pvc-e2989b44-2e27-11ea-80e4-005056a239d9   1Gi        RWO            Delete           Bound    cassandra/cassandra-data-cassandra-0   cass-sc-csi             3m33s

$ kubectl describe pvc cassandra-data-cassandra-0 -n cassandra
Name:          cassandra-data-cassandra-0
Namespace:     cassandra
StorageClass:  cass-sc-csi
Status:        Bound
Volume:        pvc-e2989b44-2e27-11ea-80e4-005056a239d9
Labels:        app=cassandra
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-class: cass-sc-csi
               volume.beta.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Events:
  Type    Reason                Age                    From                         Message
  ----    ------                ----                   ----                         -------
  Normal  ExternalProvisioning  7m52s (x2 over 7m52s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "csi.vsphere.vmware.com" or manually created by system administrator
  Normal  Provisioning          7m52s                  csi.vsphere.vmware.com_vsphere-csi-controller-0_2dacb016-230d-11ea-b0c6-9e255ee28255  External provisioner is provisioning volume for claim "cassandra/cassandra-data-cassandra-0"
Mounted By:    cassandra-0

$ kubectl describe pv pvc-01936124-2e28-11ea-80e4-005056a239d9
Name:            pvc-01936124-2e28-11ea-80e4-005056a239d9
Labels:          &lt;none&gt;
Annotations:     pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
Finalizers:      [kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com]
StorageClass:    cass-sc-csi
Status:          Bound
Claim:           cassandra/cassandra-data-cassandra-1
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        1Gi
Node Affinity:   &lt;none&gt;
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            csi.vsphere.vmware.com
    VolumeHandle:      8fe0fc5b-c42b-42f5-bde0-3ec992d45d15
    ReadOnly:          false
    VolumeAttributes:  fstype=
                       storage.kubernetes.io/csiProvisionerIdentity=1576835009605-8081-csi.vsphere.vmware.com
                       type=vSphere CNS Block Volume
Events:          &lt;none&gt;
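One field in the describe output that is worth calling out is the VolumeHandle, which identifies the volume on the vSphere side for CSI-provisioned PVs. Pulling it out is the natural first step when mapping a PV back to its backing storage; a small sketch, using a captured fragment of the output above rather than a live cluster:

```shell
# Captured fragment of 'kubectl describe pv ...' output from above.
pv_desc='Name:            pvc-01936124-2e28-11ea-80e4-005056a239d9
Driver:            csi.vsphere.vmware.com
VolumeHandle:      8fe0fc5b-c42b-42f5-bde0-3ec992d45d15'

# Extract the volume handle, which identifies the volume to vSphere.
volume_handle=$(echo "$pv_desc" | awk '/VolumeHandle:/ { print $2 }')
echo "$volume_handle"
```

This is just an illustration of where the vSphere-side identifier lives, not a claim about how vtopology itself performs the lookup.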
Let’s now see what vSphere information vtopology can report about the persistent volume:
$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -pv pvc-01936124-2e28-11ea-80e4-005056a239d9

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01
*** To switch to another context, use the kubectl config use-context command ***

=== vSphere Datastore information for PV pvc-01936124-2e28-11ea-80e4-005056a239d9 ===

        Datastore Name  : vsanDatastore
        Datastore State : Available
        Datastore Type  : vsan
        Capacity (GB)   : 5961.625
        Free Space (GB) : 3574.25

=== Virtual Machine Disk (VMDK) information for PV pvc-01936124-2e28-11ea-80e4-005056a239d9 ===

        VMDK Name          : d7f9cc0f8c614c97b579c6af8d0ffc87.vmdk
        VMDK Type          : Flat
        VMDK Capacity (GB) : 1
        VMDK Filename      : [vsanDatastore] 33d05a5d-e436-3297-94f7-246e962f4910/d7f9cc0f8c614c97b579c6af8d0ffc87.vmdk

=== Storage Policy (SPBM) information for PV pvc-01936124-2e28-11ea-80e4-005056a239d9 ===

        Kubernetes VM/Node : k8s-worker4
        Hard Disk Name     : Hard disk 2
        Policy Name        : raid-1
        Policy Compliance  : compliant
Not bad. Now we can see the datastore, the policy, and even the path to the VMDK file. This should work both for K8s clusters using the original VCP driver (e.g. Enterprise PKS) and for newer clusters using the CSI driver.
Some of the other items in vtopology are a work in progress. I’m not sure what useful information I could display from a networking perspective just yet. Ideas welcomed. Now for those of us who have access to both vSphere and Kubernetes, switching contexts between the two is probably second nature. But for those K8s cluster administrators who may not have access to vSphere, vtopology could be very useful (at least in my opinion).
Disclaimer
vtopology has only been tested against natively deployed Kubernetes clusters (using kubeadm) on vSphere, as well as Enterprise PKS. It has not been tested against any RedHat OpenShift deployments running on vSphere, or Google Anthos on-prem deployments. If anyone has the chance to try vtopology out on those platforms, I’d be very interested to hear how it goes.
Note that this is not VMware code – it is my own code, so the content is provided “as-is”. This means that the code is used at the user’s own risk, is provided without warranty, and carries no liability for damages resulting from its use. The code may also change at any time as new updates are added.
Where can I get it?
vtopology is available on GitHub – you can get it here: https://github.com/cormachogan/vtopology
Feedback
Let me know what you think. What can be done to improve it? What other parts of the vSphere infrastructure would be useful to query from vtopology?