The Kubernetes community has been talking about deprecating the in-tree storage drivers for some time, and the onus is on the various vendors to deliver a CSI driver in their place. In fact, many of the in-tree drivers are no longer maintained, so they no longer get new features such as online extend and volume snapshots. The vSphere CSI driver has already been available for a number of years, but the in-tree vSphere Cloud Provider (VCP) is still in use. In this post, I will discuss the new vSphere CSI Migration “beta” feature, which seamlessly redirects VCP volume requests to the underlying vSphere CSI driver. This means that even when the VCP is deprecated and removed from Kubernetes, any applications using VCP volumes won’t be impacted, since volume operations will be handled by the vSphere CSI driver.
Note: After running this experiment, I noticed that a new release of the vSphere CSI driver, version 2.2.0, has just been released. Whilst CSI Migration will work with both CSI versions 2.1.x and 2.2.0, it is recommended that you use version 2.2.0 as this includes some additional CSI Migration updates.
Prerequisites
There is a complete documented process available for implementing CSI Migration; I will provide a step-by-step procedure here. There are a number of prerequisites before you begin:
- vSphere 7.0U1 minimum
- Kubernetes v1.19 minimum
- CPI – The vSphere Cloud Provider Interface
- CSI – The vSphere Container Storage Interface driver (v2.1.x minimum, v2.2.0 preferred)
Also ensure that your local kubectl binary is at version 1.19 or above. Here is what I used in my lab.
$ kubectl version --short
Client Version: v1.20.5
Server Version: v1.20.5
Deploy vSphere CSI with Migration enabled
Installation of vSphere, the Kubernetes cluster and the CPI (Cloud Provider Interface) is beyond the scope of this article. The CSI driver installation also follows the standard approach for upstream Kubernetes, with one modification: the CSI controller manifest YAML needs to have the ConfigMap called internal-feature-states.csi.vsphere.vmware.com updated with the csi-migration field set to true. After making this change, deploy both the controller and node manifests as normal (a sketch of this is shown after the ConfigMap snippet below).
---
apiVersion: v1
data:
  "csi-migration": "true"   # csi-migration feature is only available for vSphere 7.0U1
kind: ConfigMap
metadata:
  name: internal-feature-states.csi.vsphere.vmware.com
  namespace: kube-system
---
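With the ConfigMap edited, deploying the driver is simply a matter of applying the controller and node manifests. Here is a minimal sketch, assuming you are applying the manifests from the same release-2.1 directory used for the webhook scripts later in this post; the exact file names are an assumption and may differ in your driver version.

# Sketch only: file names below are assumptions; check the vsphere-csi-driver
# repo for the exact manifest names shipped with your version.
kubectl apply -f vsphere-csi-controller-deployment.yaml   # controller, with "csi-migration": "true"
kubectl apply -f vsphere-csi-node-ds.yaml                 # node DaemonSet

# Confirm the feature state was picked up
kubectl get configmap internal-feature-states.csi.vsphere.vmware.com \
  -n kube-system -o yaml | grep csi-migration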
Install an admission webhook
This webhook enables validation which prevents users from creating or updating a StorageClass with migration-specific parameters that are not allowed in the vSphere CSI storage provisioner. This is because certain StorageClass parameters which were supported by the VCP are no longer supported in CSI. Scripts are provided to set up the webhook. First, there is a script that takes care of creating the appropriate certificate. Here is how to download and run it, along with some example output.
$ curl -O https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/release-2.1/manifests/v2.1.0/vsphere-7.0u1/vanilla/deploy/generate-signed-webhook-certs.sh
$ chmod a+x ./generate-signed-webhook-certs.sh
$ ./generate-signed-webhook-certs.sh
creating certs in tmpdir /tmp/tmp.ChlzHFg1aF
Generating RSA private key, 2048 bit long modulus
.............................................+++
.......................................+++
e is 65537 (0x10001)
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
certificatesigningrequest.certificates.k8s.io/vsphere-webhook-svc.kube-system created
NAME                              AGE   SIGNERNAME                     REQUESTOR          CONDITION
vsphere-webhook-svc.kube-system   0s    kubernetes.io/legacy-unknown   kubernetes-admin   Pending
certificatesigningrequest.certificates.k8s.io/vsphere-webhook-svc.kube-system approved

$ kubectl get csr -A
NAME                              AGE   SIGNERNAME                     REQUESTOR          CONDITION
vsphere-webhook-svc.kube-system   13s   kubernetes.io/legacy-unknown   kubernetes-admin   Approved,Issued
An additional manifest and script are provided which create the remaining webhook objects, including the webhook Deployment/Pod, Service Account, Cluster Role, Cluster Role Binding, and the Service that binds to the webhook Pod.
$ curl -O https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/release-2.1/manifests/v2.1.0/vsphere-7.0u1/vanilla/deploy/validatingwebhook.yaml
$ curl -O https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/release-2.1/manifests/v2.1.0/vsphere-7.0u1/vanilla/deploy/create-validation-webhook.sh
$ chmod a+x ./create-validation-webhook.sh
$ ./create-validation-webhook.sh
service/vsphere-webhook-svc created
validatingwebhookconfiguration.admissionregistration.k8s.io/validation.csi.vsphere.vmware.com created
serviceaccount/vsphere-csi-webhook created
clusterrole.rbac.authorization.k8s.io/vsphere-csi-webhook-role created
clusterrolebinding.rbac.authorization.k8s.io/vsphere-csi-webhook-role-binding created
deployment.apps/vsphere-csi-webhook created
These commands check that the webhook Pod and Service deployed successfully.
$ kubectl get pods -A | grep csi
kube-system   vsphere-csi-controller-6f7484d584-f5kcd   6/6   Running   0   3h45m
kube-system   vsphere-csi-node-f7pgs                    3/3   Running   0   3h42m
kube-system   vsphere-csi-node-fpxdl                    3/3   Running   0   3h42m
kube-system   vsphere-csi-node-lkgg5                    3/3   Running   0   3h42m
kube-system   vsphere-csi-node-w8kt9                    3/3   Running   0   3h42m
kube-system   vsphere-csi-webhook-7897cfb96d-9bxwf      1/1   Running   0   3h14m

$ kubectl get svc vsphere-webhook-svc -n kube-system
NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
vsphere-webhook-svc   ClusterIP   10.106.219.128   <none>        443/TCP   3h19m

$ kubectl get deploy vsphere-csi-webhook -n kube-system
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
vsphere-csi-webhook   1/1     1            1           42s
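At this point you can also confirm that the ValidatingWebhookConfiguration itself was registered; the object name comes from the create-validation-webhook.sh output above.

$ kubectl get validatingwebhookconfiguration validation.csi.vsphere.vmware.com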
Enabling CSI Migration on the Kubernetes Cluster
We now need to make some adjustments to the Kubernetes cluster to support VCP to CSI Migration. Since this is still a beta feature, we need to add some feature flags to the kube-controller-manager on the control plane node(s) and to the kubelet on all control plane and worker/workload nodes. The feature flags are CSIMigration and CSIMigrationvSphere. The CSIMigrationvSphere flag routes volume operations from the in-tree vSphere VCP plugin to the vSphere CSI plugin. If there is no vSphere CSI driver installed, the operations revert to using the in-tree VCP. The CSIMigrationvSphere feature flag requires the CSIMigration feature flag.
kube-controller-manager changes on the control plane node(s)
Log on to the control plane node(s) and edit the /etc/kubernetes/manifests/kube-controller-manager.yaml manifest to add the feature-gates entry, shown as the last command-line argument in the manifest below:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cloud-config=/etc/kubernetes/vsphere.conf
    - --cloud-provider=vsphere
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    - --feature-gates=CSIMigration=true,CSIMigrationvSphere=true
    image: k8s.gcr.io/kube-controller-manager:v1.20.5
kubelet config changes on the control plane node(s)
Log on to the control plane node(s) and edit the kubelet configuration file /var/lib/kubelet/config.yaml to add the featureGates entries, shown at the end of the file below:
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
resolvConf: /run/systemd/resolve/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
featureGates:
  CSIMigration: true
  CSIMigrationvSphere: true
After making the changes, restart the kubelet on the control plane node(s) as follows. It is advisable to check the status of the kubelet afterwards to make sure it starts correctly. If it fails to restart, it may be due to a typo in the configuration file. On systemd-based distributions such as Ubuntu, a command like journalctl -xe -u kubelet can be used to find out why the kubelet did not restart successfully.
$ sudo systemctl restart kubelet
$ sudo systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Tue 2021-04-13 09:26:47 UTC; 1s ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 3848205 (kubelet)
      Tasks: 10 (limit: 4676)
     Memory: 22.3M
     CGroup: /system.slice/kubelet.service
             └─3848205 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cloud-config=/etc/kubernetes/vsphere.co>
This will trigger a restart of the control plane Pods in kube-system. Use the kubectl get pods -A command to monitor etcd, kube-apiserver and kube-controller-manager. These should be Running and Ready (1/1) within a few minutes. Wait for everything to be available before proceeding with further changes. A quick sanity check is sketched below.
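As a sanity check (a sketch rather than part of the official procedure), you can confirm that the feature gates are present in both files you just edited, and that the control plane Pods have come back up:

# On the control plane node:
grep feature-gates /etc/kubernetes/manifests/kube-controller-manager.yaml
grep -A2 featureGates: /var/lib/kubelet/config.yaml

# From any machine with kubectl access:
kubectl get pods -n kube-system | grep -E 'etcd|kube-apiserver|kube-controller-manager'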
kubelet config changes on the worker node(s)
The procedure for the worker nodes is the same as for the control plane nodes, except that before making the changes, we need to drain each worker node of any running applications. Once the worker node has been drained, log on to the node, add the feature gates to the kubelet config, restart the kubelet, log out, then uncordon the node so that applications can once again be scheduled on it. The feature gate entries for the worker node kubelet config are exactly the same as those on the control plane nodes.
$ kubectl drain k8s-worker-01 --force --ignore-daemonsets
node/k8s-worker-01 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-flannel-ds-r899q, kube-system/kube-proxy-dvktz, kube-system/vsphere-csi-node-lkgg5
node/k8s-worker-01 drained

$ ssh ubuntu@k8s-worker-01
ubuntu@k8s-worker-01:~$ sudo vi /var/lib/kubelet/config.yaml
<<-- add feature gates entries and save the file -->>
featureGates:
  CSIMigration: true
  CSIMigrationvSphere: true
ubuntu@k8s-worker-01:~$ sudo systemctl restart kubelet
ubuntu@k8s-worker-01:~$ sudo systemctl status kubelet
ubuntu@k8s-worker-01:~$ exit

$ kubectl uncordon k8s-worker-01
node/k8s-worker-01 uncordoned
Repeat this process for all worker nodes; a scripted approach is sketched below. Again, if there are issues with the edits made to the config file, use a command such as journalctl -xe -u kubelet to examine the logs and see what is causing the error.
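If you have a larger number of worker nodes, the drain/edit/restart/uncordon cycle can be wrapped in a small loop. The loop below is only a sketch: it assumes passwordless SSH as the ubuntu user with sudo rights, and that featureGates is not already present in each node's /var/lib/kubelet/config.yaml (the entries are simply appended to the end of the file). Adapt it to your own environment.

# Sketch: append the feature gates, restart the kubelet, and uncordon each worker in turn.
for node in k8s-worker-01 k8s-worker-02 k8s-worker-03; do
  kubectl drain "${node}" --force --ignore-daemonsets
  ssh "ubuntu@${node}" "printf 'featureGates:\n  CSIMigration: true\n  CSIMigrationvSphere: true\n' | sudo tee -a /var/lib/kubelet/config.yaml && sudo systemctl restart kubelet"
  kubectl uncordon "${node}"
done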
Migration Annotations
We should now be able to check the annotations on nodes, Pods, PVCs and PVs to verify that these are indeed migrated to use the CSI driver.
CSINode Migration Annotations
Each CSINode object should now display a migrated-plugins annotation, as shown here.
$ kubectl describe csinodes k8s-controlplane-01 | grep Annot
Annotations:        storage.alpha.kubernetes.io/migrated-plugins: kubernetes.io/vsphere-volume
$ kubectl describe csinodes k8s-worker-03 | grep Annot
Annotations:        storage.alpha.kubernetes.io/migrated-plugins: kubernetes.io/vsphere-volume
$ kubectl describe csinodes k8s-worker-02 | grep Annot
Annotations:        storage.alpha.kubernetes.io/migrated-plugins: kubernetes.io/vsphere-volume
$ kubectl describe csinodes k8s-worker-01 | grep Annot
Annotations:        storage.alpha.kubernetes.io/migrated-plugins: kubernetes.io/vsphere-volume
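Rather than describing each CSINode individually, a single command along these lines can be used to check the annotation across all of the nodes at once:

$ kubectl describe csinodes | grep -E '^Name:|migrated-plugins'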
PVC Migration Annotations
In this Kubernetes cluster, I had previously deployed an application (a NoSQL Cassandra DB). It is deployed as a StatefulSet with 3 replicas, thus 3 Pods, 3 PVCs and 3 PVs. It was using the VCP when originally deployed. If I check the Persistent Volume Claims (PVCs) of this application, I should now see a new migration annotation, as shown below.
$ kubectl describe pvc cassandra-data-cassandra-0 -n cassandra
Name:          cassandra-data-cassandra-0
Namespace:     cassandra
StorageClass:  cass-sc-vcp
Status:        Bound
Volume:        pvc-900b85d6-c2a8-4996-a7a4-ec9a23bffd77
Labels:        app=cassandra
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               pv.kubernetes.io/migrated-to: csi.vsphere.vmware.com
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/vsphere-volume
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       cassandra-0
Events:        <none>
PV Migration Annotations
A similar annotation should also be associated with the Persistent Volumes used by the StatefulSet.
$ kubectl describe pv pvc-affa795f-db69-4456-8baa-6a2fe9f19d2e -n cassandra
Name:            pvc-affa795f-db69-4456-8baa-6a2fe9f19d2e
Labels:          <none>
Annotations:     kubernetes.io/createdby: vsphere-volume-dynamic-provisioner
                 pv.kubernetes.io/bound-by-controller: yes
                 pv.kubernetes.io/migrated-to: csi.vsphere.vmware.com
                 pv.kubernetes.io/provisioned-by: kubernetes.io/vsphere-volume
Finalizers:      [kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com]
StorageClass:    cass-sc-vcp
Status:          Bound
Claim:           cassandra/cassandra-data-cassandra-2
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        1Gi
Node Affinity:   <none>
Message:
Source:
    Type:               vSphereVolume (a Persistent Disk resource in vSphere)
    VolumePath:         [vsanDatastore-OCTO-Cluster-B] d85b7460-17a6-6ea1-4b39-246e962f497c/kubernetes-dynamic-pvc-affa795f-db69-4456-8baa-6a2fe9f19d2e.vmdk
    FSType:             ext4
    StoragePolicyName:  vsan-b
Events:          <none>
Note that the PV specification retains the original VCP Volume Path. If there is ever any reason to disable the CSI migration feature, the volume operations can revert to using the in-tree VCP plugin.
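A quick way to verify this across the board is to check that every VCP-provisioned PV still shows its original VolumePath alongside the migrated-to annotation, for example:

$ kubectl describe pv | grep -E '^Name:|migrated-to|VolumePath'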
Caution: During my testing, I found a limitation on the CSI volume ID, which restricts the length of the volume ID path to 128 characters. Thus, if you have a vSphere datastore name that is longer than 24 characters, you may hit an issue when migrating pre-existing VCP volumes to CSI. This has been reported as a Kubernetes issue. For example, here is a comparison of two different datastore name lengths:

(CSI migration works - volume ID length < 128)
VolumePath: [vsan-OCTO-Cluster-B] d85b7460-17a6-6ea1-4b39-246e962f497c/kubernetes-dynamic-pvc-920d248d-7e66-451a-a291-c7dfa3458a72.vmdk

(CSI migration fails - volume ID length > 128)
VolumePath: [vsanDatastore-OCTO-Cluster-B] d85b7460-17a6-6ea1-4b39-246e962f497c/kubernetes-dynamic-pvc-91c5826f-ed8e-4d55-a5bc-c4b31e435f5d.vmdk
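Before relying on migration for pre-existing volumes, it may be worth checking how long the volume paths of your existing VCP volumes actually are. Here is a rough sketch; the 128-character limit applies to the volume ID the CSI driver derives from this path, so treat anything close to that length with suspicion.

# Print the length of each in-tree vSphere VolumePath alongside the path itself
kubectl get pv -o jsonpath='{range .items[*]}{.spec.vsphereVolume.volumePath}{"\n"}{end}' \
  | grep vmdk \
  | awk '{ print length($0), $0 }'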
Functionality test
Since the CSI driver is now doing the volume provisioning, we are no longer able to use some StorageClass parameters that were accepted by the VCP. For example, if we now try to provision a volume using a StorageClass with the parameter diskformat: eagerzeroedthick, which is supported by the VCP but not by CSI, the volume provisioning will fail (as described in the webhook section earlier).
$ cat vsan-sc-vcp.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan-sc-vcp
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: eagerzeroedthick
  storagePolicyName: "vsan-b"
  datastore: "vsanDatastore-OCTO-Cluster-B"
The following will be observed if you use kubectl to describe the PVC and check the events; eagerzeroedthick is no longer a valid diskformat parameter.
Events:
  Type     Reason                Age                            From                                                                                                 Message
  ----     ------                ----                           ----                                                                                                 -------
  Normal   ExternalProvisioning  <invalid> (x2 over <invalid>)  persistentvolume-controller                                                                          waiting for a volume to be created, either by external provisioner "csi.vsphere.vmware.com" or manually created by system administrator
  Normal   Provisioning          <invalid> (x4 over <invalid>)  csi.vsphere.vmware.com_vsphere-csi-controller-6f7484d584-f5kcd_f0b90478-2001-4217-b5f2-34cd53ef4c86  External provisioner is provisioning volume for claim "default/vsan-pvc-vcp"
  Warning  ProvisioningFailed    <invalid> (x4 over <invalid>)  csi.vsphere.vmware.com_vsphere-csi-controller-6f7484d584-f5kcd_f0b90478-2001-4217-b5f2-34cd53ef4c86  failed to provision volume with StorageClass "vsan-sc-vcp": rpc error: code = InvalidArgument desc = Parsing storage class parameters failed with error: invalid parameter. key:diskformat-migrationparam, value:eagerzeroedthick
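For newly provisioned volumes, the cleanest approach is a CSI-native StorageClass that avoids the legacy VCP parameters altogether. A minimal sketch is shown below; the StorageClass name vsan-sc-csi is hypothetical, and it assumes the vSphere storage policy vsan-b from my lab, so substitute your own policy name.

cat <<EOF | kubectl apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan-sc-csi            # hypothetical name for this example
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vsan-b"  # assumption: an SPBM policy from my lab; use your own
EOF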
Hopefully this has provided you with some good guidance on how to implement VCP to CSI Migration. Note that this feature is currently beta, so it is not yet permanently enabled in upstream Kubernetes distributions. However, the feature flags can be enabled as shown here to allow you to test the procedure and verify that in-tree VCP volumes can now be managed with the vSphere CSI driver. Keep in mind that eventually the VCP will be deprecated and will no longer be available in Kubernetes distributions. This migration mechanism should alleviate any concerns around that deprecation.
Added benefit: VCP volumes now visible in CNS
Since VCP volume operations are now handled by the vSphere CSI driver, the Cloud Native Storage (CNS) component in vSphere can now surface volume information in the vSphere client. The volumes for my NoSQL Cassandra StatefulSet, which I had previously deployed using the in-tree VCP, are now visible in the vSphere client. Here are a few screenshots to show this visibility. First, we can see the 3 persistent volumes.
Next, if we click on the details icon, we can see some basic information about the volume, including which datastore it is provisioned onto:
If we select the Kubernetes objects view, we can see more detail about the K8s cluster and volume. This is vanilla Kubernetes, not one of the VMware Tanzu editions. We can also see the K8s namespace and any Pod(s) that are using the volume.
Lastly, we can see the physical placement. In this case, since it is a vSAN datastore, we can see exactly how this volume is built (it is a RAID-1). So we have end-to-end visibility from the K8s volume to the vSphere volume to the physical storage devices.
Pretty neat!