Announcing vSphere CSI driver v2.5 metrics for Prometheus monitoring

This post looks at another new feature that has been added to the vSphere CSI driver v2.5. This feature exposes CSI metrics so that they can be collected by Prometheus and stored as time-series data. Using the information captured in Prometheus, we can build Grafana dashboards, which makes it easy to monitor the health and stability of the CSI driver. Kudos to one of our vSphere CSI driver engineers, Liping Xue, who did a great write-up on how to test this feature, and whose content I relied on heavily to create this post.

In the vSphere CSI controller pod, there are two containers that expose metrics. The first is the vsphere-csi-controller container, which provides the communication from the Kubernetes cluster API server to the CNS component on vCenter Server for volume lifecycle operations. The second is the vsphere-syncer container, which sends metadata about persistent volumes back to the CNS component on vCenter Server so that it can be displayed in the vSphere Client UI in the Container Volumes view. The vsphere-csi-controller container exposes Prometheus metrics on port 2112, while the vsphere-syncer container exposes Prometheus metrics on port 2113. The full list of metrics exposed by the CSI driver is available in the official docs. We can also check this on our cluster.

% kubectl get svc -n vmware-system-csi
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)            AGE
vsphere-csi-controller  ClusterIP  <none>        2112/TCP,2113/TCP  26d
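With the service and its ports confirmed, the two metrics endpoints can also be spot-checked by hand. A minimal sketch, assuming kubectl access to the cluster (the port-forward runs in the background while curl samples the raw Prometheus exposition):

```shell
# Forward both metrics ports of the CSI controller (controller: 2112, syncer: 2113)
kubectl -n vmware-system-csi port-forward deploy/vsphere-csi-controller 2112 2113 &

# Sample the raw metrics from each container
curl -s http://localhost:2112/metrics | head
curl -s http://localhost:2113/metrics | head
```

If the endpoints are healthy, each curl returns plain-text metrics prefixed with # HELP and # TYPE comment lines.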

There are multiple ways to deploy a Prometheus monitoring stack. There is a Prometheus Operator, kube-prometheus, Helm charts maintained by the Prometheus community, and if using TKGm, there is the Carvel tools approach which I have blogged about previously. Since this is a vanilla, upstream K8s cluster where the vSphere CSI driver v2.5 is deployed, I will use the kube-prometheus approach and follow the quickstart guide to very quickly stand up a Prometheus monitoring stack which includes AlertManager and Grafana.

Step 1. Clone the kube-prometheus repository from GitHub

% git clone https://github.com/prometheus-operator/kube-prometheus.git
Cloning into 'kube-prometheus'...
remote: Enumerating objects: 15523, done.
remote: Counting objects: 100% (209/209), done.
remote: Compressing objects: 100% (119/119), done.
remote: Total 15523 (delta 126), reused 123 (delta 78), pack-reused 15314
Receiving objects: 100% (15523/15523), 7.79 MiB | 542.00 KiB/s, done.
Resolving deltas: 100% (9884/9884), done.

% cd kube-prometheus

% ls
DCO                  experimental               kubescape-exceptions.json
LICENSE              go.mod                     kustomization.yaml
Makefile             go.sum                     manifests
developer-workspace  jsonnet                    scripts
docs                 jsonnetfile.json           sync-to-internal-registry.jsonnet
example.jsonnet      jsonnetfile.lock.json      tests

Step 2. Apply the kube-prometheus manifests

This is done in two steps. The first is to create the Custom Resource Definitions used by the Prometheus stack, then deploy the actual Prometheus stack objects.
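Sketched as commands, this follows the kube-prometheus quickstart; the kubectl wait in the middle gives the new CRDs time to become established before any objects that depend on them are applied:

```shell
# First: create the monitoring namespace and the Prometheus Operator CRDs
kubectl apply --server-side -f manifests/setup

# Wait for the CRDs to be established before using them
kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring

# Second: deploy the Prometheus stack objects themselves
kubectl apply -f manifests/
```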

% kubectl apply --server-side -f manifests/setup
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com serverside-applied
namespace/monitoring serverside-applied

% kubectl get crd
NAME                                          CREATED AT
alertmanagerconfigs.monitoring.coreos.com     2022-03-09T09:16:24Z
alertmanagers.monitoring.coreos.com           2022-03-09T09:16:25Z
...
podmonitors.monitoring.coreos.com             2022-03-09T09:16:25Z
probes.monitoring.coreos.com                  2022-03-09T09:16:25Z
prometheuses.monitoring.coreos.com            2022-03-09T09:16:26Z
prometheusrules.monitoring.coreos.com         2022-03-09T09:16:27Z
...
servicemonitors.monitoring.coreos.com         2022-03-09T09:16:27Z
thanosrulers.monitoring.coreos.com            2022-03-09T09:16:27Z
...

A significant number of K8s objects are created when the contents of the manifests folder are deployed. The output below is truncated for brevity.

% kubectl apply -f manifests/
...
poddisruptionbudget.policy/alertmanager-main created
...
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
...
configmap/blackbox-exporter-configuration created
deployment.apps/blackbox-exporter created
service/blackbox-exporter created
serviceaccount/blackbox-exporter created
...
secret/grafana-config created
secret/grafana-datasources created
configmap/grafana-dashboard-alertmanager-overview created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-grafana-overview created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
...
service/grafana created
serviceaccount/grafana created
...
deployment.apps/kube-state-metrics created
...
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
...
daemonset.apps/node-exporter created
...
service/node-exporter created
serviceaccount/node-exporter created
...
poddisruptionbudget.policy/prometheus-k8s created
...
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
...
configmap/adapter-config created
deployment.apps/prometheus-adapter created
poddisruptionbudget.policy/prometheus-adapter created
...
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
...
deployment.apps/prometheus-operator created
...
service/prometheus-operator created
serviceaccount/prometheus-operator created
...

% kubectl get servicemonitors -A
NAMESPACE    NAME                      AGE
monitoring   alertmanager-main         46s
monitoring   blackbox-exporter         45s
monitoring   coredns                   25s
monitoring   grafana                   27s
monitoring   kube-apiserver            25s
monitoring   kube-controller-manager   24s
monitoring   kube-scheduler            24s
monitoring   kube-state-metrics        26s
monitoring   kubelet                   24s
monitoring   node-exporter             23s
monitoring   prometheus-adapter        17s
monitoring   prometheus-k8s            20s
monitoring   prometheus-operator       16s

% kubectl get pod -n monitoring
NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2    Running   0          95s
alertmanager-main-1                    2/2    Running   0          95s
alertmanager-main-2                    2/2    Running   0          95s
blackbox-exporter-7d89b9b799-svr4t     3/3    Running   0          2m5s
grafana-5577bc8799-b5bnd               1/1    Running   0          107s
kube-state-metrics-d5754d6dc-spx4w     3/3    Running   0          106s
node-exporter-8b44z                    2/2    Running   0          103s
node-exporter-jrxrc                    2/2    Running   0          103s
node-exporter-pj7nb                    2/2    Running   0          103s
prometheus-adapter-6998fcc6b5-dlqk6    1/1    Running   0          97s
prometheus-adapter-6998fcc6b5-qswk4    1/1    Running   0          97s
prometheus-k8s-0                       2/2    Running   0          94s
prometheus-k8s-1                       2/2    Running   0          94s
prometheus-operator-59647c66cf-ldppj   2/2    Running   0          96s

% kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       ClusterIP   <none>        9093/TCP,8080/TCP            7m18s
alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   6m47s
blackbox-exporter       ClusterIP   <none>        9115/TCP,19115/TCP           7m17s
grafana                 ClusterIP    <none>        3000/TCP                     7m
kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            6m58s
node-exporter           ClusterIP   None            <none>        9100/TCP                     6m55s
prometheus-adapter      ClusterIP    <none>        443/TCP                      6m50s
prometheus-k8s          ClusterIP   <none>        9090/TCP,8080/TCP            6m53s
prometheus-operated     ClusterIP   None            <none>        9090/TCP                     6m46s
prometheus-operator     ClusterIP   None            <none>        8443/TCP                     6m49s

Step 3. Adjust the ClusterRole prometheus-k8s

One necessary adjustment is to the prometheus-k8s ClusterRole. When deployed through kube-prometheus, it does not have the necessary apiGroup resources and verbs rules to pick up vSphere CSI metrics, so it needs to be modified with the required rules before proceeding any further. Below, the rules when the ClusterRole is first created are displayed, followed by a new manifest which updates the rules. Lastly, the updated ClusterRole is displayed.

% kubectl get ClusterRole prometheus-k8s -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      ...
  creationTimestamp: "2022-03-09T09:19:39Z"
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.33.4
  name: prometheus-k8s
  resourceVersion: "7283142"
  uid: e18f021c-3e6e-4162-98ca-bbf912b75b06
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get

% cat prometheus-clusterRole-updated.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.33.0
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs:
  - /metrics
  verbs:
  - get

% kubectl apply -f prometheus-clusterRole-updated.yaml
clusterrole.rbac.authorization.k8s.io/prometheus-k8s configured

% kubectl get ClusterRole prometheus-k8s -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      ...
  creationTimestamp: "2022-03-09T09:19:39Z"
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.33.0
  name: prometheus-k8s
  resourceVersion: "7284231"
  uid: e18f021c-3e6e-4162-98ca-bbf912b75b06
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get

Step 4. Create Service Monitor

To monitor any service through Prometheus, such as the vSphere CSI driver, a ServiceMonitor object must be created. The following is the manifest for the ServiceMonitor object that will be used to monitor the vsphere-csi-controller service. The endpoints refer to ports 2112 (ctlr) and 2113 (syncer) respectively. Once deployed, the list of ServiceMonitors can be checked to ensure it is running.

% cat vsphere-csi-controller-service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vsphere-csi-controller-prometheus-servicemonitor
  namespace: monitoring
  labels:
    name: vsphere-csi-controller-prometheus-servicemonitor
spec:
  selector:
    matchLabels:
      app: vsphere-csi-controller
  namespaceSelector:
    matchNames:
    - vmware-system-csi
  endpoints:
  - port: ctlr
  - port: syncer

% kubectl apply -f vsphere-csi-controller-service-monitor.yaml
servicemonitor.monitoring.coreos.com/vsphere-csi-controller-prometheus-servicemonitor created

% kubectl get servicemonitors -A
NAMESPACE    NAME                                              AGE
monitoring  alertmanager-main                                  9m32s
monitoring  blackbox-exporter                                  9m31s
monitoring  coredns                                            9m11s
monitoring  grafana                                            9m13s
monitoring  kube-apiserver                                     9m11s
monitoring  kube-controller-manager                            9m10s
monitoring  kube-scheduler                                     9m10s
monitoring  kube-state-metrics                                 9m12s
monitoring  kubelet                                            9m10s
monitoring  node-exporter                                      9m9s
monitoring  prometheus-adapter                                 9m3s
monitoring  prometheus-k8s                                     9m6s
monitoring  prometheus-operator                                9m2s
monitoring  vsphere-csi-controller-prometheus-servicemonitor   42s

At this point, it is a good idea to check the logs on the prometheus-k8s-* pods in the monitoring namespace. If there are issues scraping metrics from the vSphere CSI driver, they will appear here. If you have not correctly updated the ClusterRole as mentioned previously, you may observe errors similar to this:

ts=2022-03-07T15:15:06.580Z caller=klog.go:116 level=error component=k8s_client_runtime    
func=ErrorDepth msg="pkg/mod/       
Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: 
User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\"
in API group \"\" in the namespace \"vmware-system-csi\""
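One quick way to pull those logs (the pod names will differ in your cluster):

```shell
# Search the Prometheus container logs for scrape/RBAC errors relating to the CSI namespace
kubectl -n monitoring logs prometheus-k8s-0 -c prometheus | grep -i vmware-system-csi
```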

Step 5. Launch Prometheus UI

In step 2, after deploying the manifests, the list of services was displayed. You may have noticed that the prometheus-k8s service is of type ClusterIP. This means that it is an internal service and is not accessible externally.

% kubectl get svc prometheus-k8s -n monitoring
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP  PORT(S)            AGE
prometheus-k8s  ClusterIP  <none>        9090/TCP,8080/TCP  97m

There are various ways to address this, such as changing the service type to NodePort, or to LoadBalancer if you have a provider available to supply LoadBalancer IPs. The easiest way, for the purposes of our testing, is to simply port-forward the service port (9090) and make it accessible from the local host.
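As an aside, switching the service to a NodePort would look something like the following sketch, should you prefer that route:

```shell
# Patch the service type from ClusterIP to NodePort
kubectl -n monitoring patch svc prometheus-k8s -p '{"spec": {"type": "NodePort"}}'

# The allocated node port then appears in the PORT(S) column, e.g. 9090:3xxxx/TCP
kubectl get svc prometheus-k8s -n monitoring
```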

% kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090
Forwarding from -> 9090
Forwarding from [::1]:9090 -> 9090

Now you can open a browser on your desktop and connect to http://localhost:9090 to see the Prometheus UI. Here we can check whether we are getting metrics from the vSphere CSI driver, as described in the official docs. We should be able to see metrics called vsphere_csi_info and vsphere_syncer_info, among others. Simply type the name into the query field and see if it is visible, then click on the Execute button to see further info. This is the controller info.
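The same check can be made from the command line against the Prometheus HTTP API while the port-forward is running (jq is assumed to be installed locally; note that the metric names use underscores):

```shell
# Query the vsphere_csi_info metric via the Prometheus HTTP API
curl -s 'http://localhost:9090/api/v1/query?query=vsphere_csi_info' | jq '.data.result'
```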

This is the syncer info.

This all looks good so we can proceed with launching the Grafana portal and creating a dashboard to display the metrics.

Step 6. Launch Grafana UI, create dashboard

The Grafana UI can be accessed in much the same way as the Prometheus UI. Again, it has been deployed with a ClusterIP type service, so it is not accessible outside of the Kubernetes cluster. We can once again use the port-forward functionality to access it from a browser on the local host. This time the port is 3000.

% kubectl get svc grafana -n monitoring
NAME      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
grafana   ClusterIP  <none>        3000/TCP  98m

% kubectl --namespace monitoring port-forward svc/grafana 3000
Forwarding from -> 3000
Forwarding from [::1]:3000 -> 3000

We can now access the Grafana UI via http://localhost:3000. This is the initial login screen. The username and password are admin / admin.

On initial login, you are asked to provide a new password. You can set a new password or choose to skip this step.

The next step is to create a dashboard for the vSphere CSI driver metrics. The good news is that Liping has already created some sample Grafana dashboards for us to use, and these are available on GitHub here. The vSphere CSI dashboard shows metrics for CSI operations, whilst the vSphere CSI-CNS dashboard shows metrics for CNS operations observed at the CSI layer. You can use these dashboards as a building block to create your own bespoke dashboards, should you so wish. Once logged in, click on the + sign on the left hand side of the Grafana UI, and then select Import.

This opens a wizard that allows you to import a dashboard directly from Grafana using the dashboard URL or ID, or to copy and paste JSON contents for a dashboard. We can copy and paste the raw JSON from Liping’s dashboards on GitHub into the panel, as shown below. Once the JSON contents are pasted, click Load.

After loading, the only other task is to set the Prometheus Source. You simply select prometheus from the dropdown list (there should only be one), and click Import.

And now you should begin to see the vSphere CSI driver metrics that have been scraped and stored by Prometheus displayed in the Grafana dashboard. Leave it running for a while and you should begin to see some graphs populating, similar to the dashboard below.

And that completes the setup. You can now observe various vSphere CSI driver metrics appearing in the Grafana dashboards. Kudos once again to Liping Xue for doing the groundwork and documenting how to stand up an environment to demonstrate the metrics feature in version 2.5. The manifests used to create the correct ClusterRole and the Prometheus service monitor can be found here.