vtopology – Insights into vSphere infrastructure from kubectl

As I got more and more familiar with running Kubernetes on top of vSphere, I came to the realization that it might be useful to be able to query the vSphere infrastructure from Kubernetes, particularly via kubectl. For example, I might like to know some details about the master and worker nodes (e.g. which ESXi host is each one on? How much of its resources are they consuming?). Similarly, if I have a persistent volume, how can I find out which vSphere datastore it is on, which storage policy it is using, and what the path to the VMDK is? So I started work on a tool (vtopology) that would let me retrieve all of this without having to log in to the vSphere Client every time.

vtopology is a combination of bash and PowerShell/PowerCLI for displaying vSphere topology from kubectl. The idea is that you should be able to map Kubernetes objects (e.g. nodes, PVs) to vSphere objects (e.g. virtual machines, VMDKs). Once installed, users can run vtopology to display the underlying vSphere infrastructure components and see how their Kubernetes cluster is consuming vSphere resources.
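To give a feel for what that mapping looks like under the covers, here is a minimal PowerCLI sketch (not vtopology's actual code) that resolves the node names reported by kubectl to vSphere virtual machines, assuming the node names match the VM names:

# Connect to vCenter, then look up each Kubernetes node as a VM
Connect-VIServer -Server vcsa-06-b.rainpole.com -User administrator@vsphere.local -Password '****'

# 'kubectl get nodes -o name' returns entries like node/k8s-worker1
$nodes = kubectl get nodes -o name | ForEach-Object { ($_ -split '/')[1] }

foreach ($node in $nodes) {
    Get-VM -Name $node | Select-Object Name, PowerState, NumCpu, MemoryGB, VMHost
}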

Both PowerShell and PowerCLI are required. Deployment instructions for PowerShell and PowerCLI on Ubuntu can be found here. The instructions are for Ubuntu 16.04, so simply modify them slightly to point to the correct repository for your OS version. I have successfully used the same steps on Ubuntu 17.04, which is the release this tool has been tested and validated on.

While vtopology can be used as a standalone PowerShell script, the tool has also been packaged so that it can run as a krew plugin. This means users can invoke it as a ‘kubectl vtopology’ command. More information on how to install krew can be found here. Once krew is installed, you can install the vtopology plugin as follows:

$ kubectl krew install --manifest=vtopology.yaml --archive=vtopology.tar.gz 
Installing plugin: vtopology 
CAVEATS: 
\ 
 | This plugin needs the following programs: 
 | * PowerShell and PowerCLI 
/ 
Installed plugin: vtopology
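For the curious, a krew plugin is described by a small YAML manifest. A rough sketch of what vtopology.yaml could look like is shown below; the caveats text is what krew echoed above, but the version and file names are illustrative rather than copied from the repo:

apiVersion: krew.googlecontainertools.github.com/v1alpha2
kind: Plugin
metadata:
  name: vtopology
spec:
  version: "v0.1.0"                # illustrative version
  shortDescription: Insights into vSphere infrastructure from kubectl
  caveats: |
    This plugin needs the following programs:
    * PowerShell and PowerCLI
  platforms:
  - selector:
      matchLabels:
        os: linux
    files:
    - from: "vtopology.sh"         # illustrative file names
      to: "."
    - from: "vtopology.ps1"
      to: "."
    bin: vtopology.sh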

Once installed, the tool can be run directly from kubectl:

$ kubectl vtopology -h

Usage: kubectl vtopology <connect-args> <args>

where connect-args (optionally) includes the following:
-vc | --vcenter
-u | --username
-p | --password

and where args is one of the following:
-e | --hosts
-v | --vms
-n | --networks
-d | --datastores
-k | --k8svms
-s | --spbm
-t | --tags
-a | --all
-h | --help

Advanced args
-pv <pv_id> - display vSphere storage details about a Persistent Volume
-kn <node_name> - display vSphere VM details about a Kubernetes node
-sp <policy> - display details of storage policy

Note this tool requires PowerShell with PowerCLI, as well as kubectl

Here are some examples of where vtopology could be useful.

I want to scale out my K8s cluster. What vSphere resources are available?

I already have a Kubernetes cluster deployed, made up of a number of master nodes and worker nodes. I wish to scale out the K8s cluster to have additional worker nodes. Can I check the usage of the current vSphere environment and see if there are available resources?

$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -e

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01

*** To switch to another context, use the kubectl config use-context command ***

Found DataCenter: CH-Datacenter

  Found Cluster: CH-Cluster

    Found ESXi HOST: esxi-dell-g.rainpole.com

    vSphere Version : 6.7.0
    Build Number : 14320388
    Connection State : Connected
    Power State : PoweredOn
    Manufacturer : Dell Inc.
    Model : PowerEdge R630
    Number of CPU : 20
    Total CPU (MHz) : 43980
    CPU Used (MHz) : 4236
    Total Memory (GB) : 127.91
    Memory Used (GB) : 78.55

  ESXi HOST esxi-dell-g.rainpole.com is part of Host Group PKS-AZ-1

    Found ESXi HOST: esxi-dell-h.rainpole.com

    vSphere Version : 6.7.0
    Build Number : 14320388
    Connection State : Connected
    Power State : PoweredOn
    Manufacturer : Dell Inc.
    Model : PowerEdge R630
    Number of CPU : 20
    Total CPU (MHz) : 43980
    CPU Used (MHz) : 4439
    Total Memory (GB) : 127.91
    Memory Used (GB) : 100.28

  ESXi HOST esxi-dell-h.rainpole.com is part of Host Group PKS-AZ-1

    Found ESXi HOST: esxi-dell-f.rainpole.com

    vSphere Version : 6.7.0
    Build Number : 14320388
    Connection State : Connected
    Power State : PoweredOn
    Manufacturer : Dell Inc.
    Model : PowerEdge R630
    Number of CPU : 20
    Total CPU (MHz) : 43980
    CPU Used (MHz) : 2003
    Total Memory (GB) : 127.91
    Memory Used (GB) : 85.69

  ESXi HOST esxi-dell-f.rainpole.com is part of Host Group PKS-AZ-1

    Found ESXi HOST: esxi-dell-e.rainpole.com

    vSphere Version : 6.7.0
    Build Number : 14320388
    Connection State : Connected
    Power State : PoweredOn
    Manufacturer : Dell Inc.
    Model : PowerEdge R630
    Number of CPU : 20
    Total CPU (MHz) : 43980
    CPU Used (MHz) : 1674
    Total Memory (GB) : 127.91
    Memory Used (GB) : 113.75

  ESXi HOST esxi-dell-e.rainpole.com is part of Host Group PKS-AZ-2

As you can see, I also report whether or not the host is part of a Host Group. This could be useful in Enterprise PKS, where Host Groups can now be integrated with Availability Zones to determine the placement of Kubernetes components. You may need to add additional ESXi hosts to an Availability Zone to accommodate more worker nodes, whilst ensuring that a failure (e.g. a rack power outage) does not impact an application. More on Enterprise PKS and Host Groups can be found here.
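Incidentally, if you want to poke at the Host Group memberships yourself, PowerCLI exposes them as DRS cluster groups. A quick sketch, using the cluster name from the output above:

# List the DRS host groups in the cluster and their member ESXi hosts
Get-DrsClusterGroup -Cluster (Get-Cluster -Name 'CH-Cluster') -Type VMHostGroup |
    Select-Object Name, @{N='Hosts'; E={ $_.Member.Name -join ', ' }}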

I want to deploy another app. What worker resources are available?

There are some kubectl commands to query the nodes, but they do not give a lot of detail from a resource perspective.

$ kubectl get nodes
NAME          STATUS   ROLES    AGE    VERSION
k8s-master    Ready    master   136d   v1.14.2
k8s-worker1   Ready    <none>   136d   v1.14.2
k8s-worker2   Ready    <none>   136d   v1.14.2
k8s-worker3   Ready    <none>   14d    v1.14.2
k8s-worker4   Ready    <none>   14d    v1.14.2
k8s-worker5   Ready    <none>   14d    v1.14.2

There is an option to get extended information:

$ kubectl get nodes -o wide
NAME          STATUS   ROLES    AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k8s-master    Ready    master   136d   v1.14.2   10.27.51.39   10.27.51.39   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
k8s-worker1   Ready    <none>   136d   v1.14.2   10.27.51.40   10.27.51.40   Ubuntu 18.04.3 LTS   4.15.0-72-generic   docker://18.6.0
k8s-worker2   Ready    <none>   136d   v1.14.2   10.27.51.41   10.27.51.41   Ubuntu 18.04.3 LTS   4.15.0-72-generic   docker://18.6.0
k8s-worker3   Ready    <none>   14d    v1.14.2   10.27.51.31   10.27.51.31   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
k8s-worker4   Ready    <none>   14d    v1.14.2   10.27.51.32   10.27.51.32   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
k8s-worker5   Ready    <none>   14d    v1.14.2   10.27.51.30   10.27.51.30   Ubuntu 18.04.3 LTS   4.15.0-58-generic   docker://18.6.0
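You can squeeze a little more out of kubectl with custom columns, e.g. pulling each node's CPU and memory capacity, but this still says nothing about the hosts and datastores underneath:

$ kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory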

We could use vtopology to get more information about the nodes using the -k option.

$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -k

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01

*** To switch to another context, use the kubectl config use-context command ***


Kubernetes Node VM Name : k8s-master

    IP Address : 10.27.51.39
    Power State : PoweredOn
    On ESXi host : esxi-dell-h.rainpole.com
    Folder : ubuntu64Guest
    Hardware Version : vmx-10
    Number of CPU : 4
    Cores per Socket : 1
    Memory (GB) : 4
    Provisioned Space (GB) : 114.55
    Used Space (GB) : 114.55


Kubernetes Node VM Name : k8s-worker1

    IP Address : 10.27.51.40
    Power State : PoweredOn
    On ESXi host : esxi-dell-e.rainpole.com
    Folder : ubuntu64Guest
    Hardware Version : vmx-15
    Number of CPU : 4
    Cores per Socket : 1
    Memory (GB) : 4
    Provisioned Space (GB) : 75.62
    Used Space (GB) : 75.62


Kubernetes Node VM Name : k8s-worker2

    IP Address : 10.27.51.41
    Power State : PoweredOn
    On ESXi host : esxi-dell-h.rainpole.com
    Folder : ubuntu64Guest
    Hardware Version : vmx-15
    Number of CPU : 4
    Cores per Socket : 1
    Memory (GB) : 4
    Provisioned Space (GB) : 88.78
    Used Space (GB) : 88.78

etc, etc.

If the nodes are part of a VM/Host Group, this would also be reported in the output above.

My app requires Persistent Storage. What datastores are available?

Many applications require Persistent Storage. But which datastores are available for provisioning of volumes? And how much space is available on them? vtopology can help here too, using the -d option.

$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -d

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01

*** To switch to another context, use the kubectl config use-context command ***


Found Datastore: 5TB-VMFS6
        State            :  Available
        Datastore Type   :  VMFS
        Capacity (GB)    :  4,998.75
        Free Space (GB)  :  3,514.90
        Connected hosts :
                 esxi-dell-g.rainpole.com
                 esxi-dell-h.rainpole.com
                 esxi-dell-f.rainpole.com
                 esxi-dell-e.rainpole.com


Found Datastore: pksvol
        State            :  Available
        Datastore Type   :  VMFS
        Capacity (GB)    :  999.75
        Free Space (GB)  :  519.66
        Connected hosts :
                 esxi-dell-g.rainpole.com
                 esxi-dell-h.rainpole.com
                 esxi-dell-f.rainpole.com
                 esxi-dell-e.rainpole.com


Found Datastore: pure-iscsi-vol1
        State            :  Unavailable
        Datastore Type   :  VMFS
        Capacity (GB)    :  99.75
        Free Space (GB)  :  48.34
        Connected hosts :
                 esxi-dell-h.rainpole.com


Found Datastore: vsanDatastore
        State            :  Available
        Datastore Type   :  vsan
        Capacity (GB)    :  5,961.63
        Free Space (GB)  :  3,574.25
        Connected hosts :
                 esxi-dell-g.rainpole.com
                 esxi-dell-h.rainpole.com
                 esxi-dell-f.rainpole.com
                 esxi-dell-e.rainpole.com


Found Datastore: vVolDatastore
        State            :  Available
        Datastore Type   :  VVOL
        Capacity (GB)    :  8,388,608.00
        Free Space (GB)  :  8,388,538.38
        Connected hosts :
                 esxi-dell-g.rainpole.com
                 esxi-dell-h.rainpole.com
                 esxi-dell-f.rainpole.com
                 esxi-dell-e.rainpole.com

etc, etc.

The ‘Connected hosts’ list is useful as it tells you which ESXi hosts have access to the datastore, i.e. whether or not it is shared. Note that pure-iscsi-vol1 above is visible to only one host, which would make it a poor choice for volumes whose pods may need to move around the cluster.
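If you want to double-check this from the vSphere side, PowerCLI can list the hosts with access to a given datastore, for example:

# Which ESXi hosts can see this datastore?
Get-VMHost -Datastore (Get-Datastore -Name 'pure-iscsi-vol1') | Select-Object Name, ConnectionState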

Let’s stay on the topic of storage. Let’s say that, as a K8s cluster admin, I want to create some StorageClasses. When K8s is running on vSphere, StorageClasses can consume Storage Policies built on vSphere. But how do I know which Storage Policies are available, and how can I get the details of each one?

How can I get details about vSphere Storage Policies?

Let’s start by getting information about the various Storage Policies that can then be incorporated into a StorageClass. We can use the -s option for this.

$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -s

*** This command is being run against the following Kubernetes configuration context:  cork8s-csi-01

*** To switch to another context, use the kubectl config use-context command ***

*** These are Storage Policies in use on the vSphere Infrastructure which could potentially be used for Kubernetes StorageClasses

        Found Policy: VVol No Requirements Policy
        Found Policy: RAID-5
        Found Policy: Space-Efficient1
        Found Policy: VM Encryption Policy
        Found Policy: vvol-simple
        Found Policy: vvol-snaps
        Found Policy: raid-1
        Found Policy: silver
        Found Policy: OSR-0
        Found Policy: gold
        Found Policy: OSR-100
        Found Policy: Host-local PMem Default Storage Policy
        Found Policy: Space-Efficient
        Found Policy: vSAN Default Storage Policy
        Found Policy: bronze
        Found Policy: VMcrypt

Now that we have the list of policies, we can query each of them using the advanced option -sp.

$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -sp raid-1

*** This command is being run against the following Kubernetes configuration context:  cork8s-csi-01

*** To switch to another context, use the kubectl config use-context command ***

Display Detailed Policy attributes of: raid-1

        Found Policy Attribute : (VSAN.hostFailuresToTolerate=1) AND \
(VSAN.replicaPreference=RAID-1 (Mirroring) - Performance) AND \
(VSAN.checksumDisabled=False) AND (VSAN.stripeWidth=1) AND \
(VSAN.forceProvisioning=False) AND (VSAN.iopsLimit=0) AND \
(VSAN.cacheReservation=0) AND (VSAN.proportionalCapacity=0)
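As an aside, this is how a policy ends up being consumed by Kubernetes: a StorageClass simply references it by name. A minimal sketch for the vSphere CSI driver using the raid-1 policy above; the cass-sc-csi class that appears in the PV/PVC listings later in this post would look something like this, though treat it as illustrative rather than the exact definition:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: cass-sc-csi
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "raid-1"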

Here is another example, this time querying a policy from a vVols datastore. Please note that at the time of writing, vVols is not supported with the vSphere CSI driver. However, it is something we hope to have supported very soon.

$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -sp vvol-snaps

*** This command is being run against the following Kubernetes configuration context: cork8s-csi-01

*** To switch to another context, use the kubectl config use-context command ***

Display Detailed Policy attributes of: vvol-snaps

Found Policy Attribute : (com.purestorage.storage.policy.PureFlashArray=True) AND \
(com.purestorage.storage.replication.LocalSnapshotPolicyCapable=True) AND \
(com.purestorage.storage.replication.LocalSnapshotInterval=01:00:00) AND \
(com.purestorage.storage.replication.LocalSnapshotRetention=7.00:00:00)

Now let’s assume that we need to troubleshoot a PV (persistent volume) issue. How can you map a PV to an actual VMDK on vSphere for further investigation? Which datastore is it on? What policy was used to create it? Let’s see how vtopology can help here.

How can I determine the VMDK that backs a PV?

We can already get some information from the PV and PVC outputs, and we can even describe them for further detail. However, none of these outputs tell us anything about the VMDK that backs the PV in vSphere.

$ kubectl get pvc -n cassandra
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cassandra-data-cassandra-0   Bound    pvc-e2989b44-2e27-11ea-80e4-005056a239d9   1Gi        RWO            cass-sc-csi    3m28s
cassandra-data-cassandra-1   Bound    pvc-01936124-2e28-11ea-80e4-005056a239d9   1Gi        RWO            cass-sc-csi    2m36s
cassandra-data-cassandra-2   Bound    pvc-3bdd199d-2e28-11ea-80e4-005056a239d9   1Gi        RWO            cass-sc-csi    58s


$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                  STORAGECLASS   REASON   AGE
pvc-01936124-2e28-11ea-80e4-005056a239d9   1Gi        RWO            Delete           Bound    cassandra/cassandra-data-cassandra-1   cass-sc-csi             2m28s
pvc-3bdd199d-2e28-11ea-80e4-005056a239d9   1Gi        RWO            Delete           Bound    cassandra/cassandra-data-cassandra-2   cass-sc-csi             56s
pvc-e2989b44-2e27-11ea-80e4-005056a239d9   1Gi        RWO            Delete           Bound    cassandra/cassandra-data-cassandra-0   cass-sc-csi             3m33s


$ kubectl describe pvc cassandra-data-cassandra-0 -n cassandra
Name:          cassandra-data-cassandra-0
Namespace:     cassandra
StorageClass:  cass-sc-csi
Status:        Bound
Volume:        pvc-e2989b44-2e27-11ea-80e4-005056a239d9
Labels:        app=cassandra
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-class: cass-sc-csi
               volume.beta.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Events:
  Type       Reason                Age                    From                                                                                  Message
  ----       ------                ----                   ----                                                                                  -------
  Normal     ExternalProvisioning  7m52s (x2 over 7m52s)  persistentvolume-controller                                                           waiting for a volume to be created, either by external provisioner "csi.vsphere.vmware.com" or manually created by system administrator
  Normal     Provisioning          7m52s                  csi.vsphere.vmware.com_vsphere-csi-controller-0_2dacb016-230d-11ea-b0c6-9e255ee28255  External provisioner is provisioning volume for claim "cassandra/cassandra-data-cassandra-0"
Mounted By:  cassandra-0


$ kubectl describe pv pvc-01936124-2e28-11ea-80e4-005056a239d9
Name:            pvc-01936124-2e28-11ea-80e4-005056a239d9
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
Finalizers:      [kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com]
StorageClass:    cass-sc-csi
Status:          Bound
Claim:           cassandra/cassandra-data-cassandra-1
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        1Gi
Node Affinity:   <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            csi.vsphere.vmware.com
    VolumeHandle:      8fe0fc5b-c42b-42f5-bde0-3ec992d45d15
    ReadOnly:          false
    VolumeAttributes:      fstype=
                           storage.kubernetes.io/csiProvisionerIdentity=1576835009605-8081-csi.vsphere.vmware.com
                           type=vSphere CNS Block Volume
Events:                <none>
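As an aside, if all you need is the CSI volume handle, jsonpath will pull it straight out of the PV:

$ kubectl get pv pvc-01936124-2e28-11ea-80e4-005056a239d9 -o jsonpath='{.spec.csi.volumeHandle}'
8fe0fc5b-c42b-42f5-bde0-3ec992d45d15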

Let’s now see what vSphere information vtopology can report about the persistent volume:

$ kubectl vtopology -vc vcsa-06-b.rainpole.com -u administrator@vsphere.local -p **** -pv pvc-01936124-2e28-11ea-80e4-005056a239d9

*** This command is being run against the following Kubernetes configuration context:  cork8s-csi-01

*** To switch to another context, use the kubectl config use-context command ***

=== vSphere Datastore information for PV pvc-01936124-2e28-11ea-80e4-005056a239d9 ===

        Datastore Name     :  vsanDatastore
        Datastore State    :  Available
        Datastore Type     :  vsan
        Capacity (GB)      :  5961.625
        Free Space (GB)    :  3574.25


=== Virtual Machine Disk (VMDK) information for PV pvc-01936124-2e28-11ea-80e4-005056a239d9 ===

        VMDK Name          :  d7f9cc0f8c614c97b579c6af8d0ffc87.vmdk
        VMDK Type          :  Flat
        VMDK Capacity (GB) :  1
        VMDK Filename      :  [vsanDatastore] 33d05a5d-e436-3297-94f7-246e962f4910/d7f9cc0f8c614c97b579c6af8d0ffc87.vmdk


=== Storage Policy (SPBM) information for PV pvc-01936124-2e28-11ea-80e4-005056a239d9 ===

        Kubernetes VM/Node :  k8s-worker4
        Hard Disk Name     :  Hard disk 2
        Policy Name        :  raid-1
        Policy Compliance  :  compliant

Not bad. Now we can see the datastore, the policy, and even the path to the VMDK file. This should work both for K8s clusters using the original VCP driver (e.g. Enterprise PKS) and for newer clusters using the CSI driver.
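For the VCP case, the mapping is reasonably easy to sketch in PowerCLI, since VCP names the backing VMDK after the PV (kubernetes-dynamic-pvc-<uid>.vmdk). The following is a rough illustration rather than vtopology's actual logic, and it assumes the node VMs are named k8s-*; CSI/CNS volumes need a different lookup:

$pv = 'pvc-01936124-2e28-11ea-80e4-005056a239d9'

# Search every hard disk attached to the node VMs for a file name containing the PV name
Get-VM -Name 'k8s-*' | Get-HardDisk |
    Where-Object { $_.Filename -match $pv } |
    Select-Object Parent, Name, Filename, CapacityGB

Policy compliance of a matching disk could then be checked with something like Get-SpbmEntityConfiguration.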

Some of the other items in vtopology are a work in progress. I’m not sure what useful information I could display from a networking perspective just yet. Ideas welcomed. Now for those of us who have access to both vSphere and Kubernetes, switching contexts between the two is probably second nature. But for those K8s cluster administrators who may not have access to vSphere, vtopology could be very useful (at least in my opinion).

Disclaimer

vtopology has only been tested against natively deployed Kubernetes clusters (using kubeadm) on vSphere, as well as Enterprise PKS. It has not been tested against any RedHat OpenShift deployments running on vSphere, or Google Anthos on-prem deployments. If anyone has the chance to try vtopology out on those platforms, I’d be very interested to hear how it goes.

Note that this is not VMware code; it is my own code, and the content is provided “as-is”. This means the code is used at the user’s own risk, and is provided without warranty and with no liability for damages resulting from its use. The code can also change at any time as new updates are added.

Where can I get it?

vtopology is available on GitHub – you can get it here: https://github.com/cormachogan/vtopology

Feedback

Let me know what you think. What can be done to improve it? What other parts of the vSphere infrastructure would be useful to query from vtopology?
