Kubernetes Storage on vSphere 101 – StorageClass
In the first 101 post, we talked about persistent volumes (PVs), persistent volume claims (PVCs) and Pods (a group of one or more containers). In particular, we saw how, with Kubernetes on vSphere, a persistent volume is essentially a VMDK (virtual machine disk) on a datastore. In that first post, we created a static VMDK on a vSAN datastore, then built manifest files (in our case, YAML) for a PV, a PVC and finally a Pod, and showed how to map that static, pre-existing VMDK directly to the Pod so that it could be mounted. We saw the VMDK/PV get attached, and we also saw it mounted within the container itself.
Now, that was quite a tedious process. You really do not want to be in the business of creating individual VMDKs in vSphere, then creating multiple PV and PVC YAML manifest files to map each VMDK as a PV to the Pods/containers. What we want is something much more dynamic; in other words, we simply call out that the Pod or Pods require a volume of a specific size and a specific access mode, and have the PV provisioned dynamically. Fortunately, there is already a way of doing this in Kubernetes. This is where StorageClass comes in. Now, it is important to understand that while a StorageClass allows for the dynamic provisioning of PVs, it is also used to define different classes of storage. For example, if we think about vSAN, VMware's hyperconverged storage offering, which provides different storage capabilities (RAID-1, RAID-5, RAID-6, stripe width, etc.), one could create a number of different StorageClasses representing each of those capabilities. As PVs are provisioned with different storage classes, they get instantiated as VMDKs with different capabilities on the underlying vSAN datastore. In this post, let's take a look at how we would use a StorageClass to dynamically provision storage with different capabilities.
To begin with, here is my very simple StorageClass definition. The only items of significant interest in this manifest YAML file are the parameters, which hold the datastore and storagePolicyName. The storagePolicyName relates to a vSAN storage policy, implying that any PVs, instantiated as VMDKs on my vSAN datastore, will have the set of attributes defined in the policy "gold". This policy has to be pre-created in the vSphere environment where you plan to create your VMDK/PV before it can be used.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: demo-sc-vsan
provisioner: kubernetes.io/vsphere-volume
parameters:
  storagePolicyName: gold
  datastore: vsanDatastore
While vSAN has specific storage policy attributes, the tag feature of storage policy based management could be used with any vSphere datastore. At the moment, there are no other StorageClasses or PVs in the environment, so let’s go ahead and build this StorageClass.
$ kubectl get sc
No resources found.

$ kubectl get pv
No resources found.

$ kubectl create -f demo-sc.yaml
storageclass.storage.k8s.io/demo-sc-vsan created

$ kubectl get sc
NAME           PROVISIONER                    AGE
demo-sc-vsan   kubernetes.io/vsphere-volume   5s

$ kubectl describe sc demo-sc-vsan
Name:                  demo-sc-vsan
IsDefaultClass:        No
Annotations:           <none>
Provisioner:           kubernetes.io/vsphere-volume
Parameters:            datastore=vsanDatastore,storagePolicyName=gold
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>
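As mentioned earlier, you are not limited to a single class. If you wanted another class of storage representing a different set of vSAN capabilities (say, RAID-5 instead of RAID-1), you could create a further StorageClass pointing at a different, pre-created policy. The class and policy names below ("demo-sc-vsan-silver" and "silver") are purely illustrative; any such policy would have to exist in vCenter first, just like "gold":

# Hypothetical second StorageClass, mapping to a different pre-created vSAN policy
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: demo-sc-vsan-silver
provisioner: kubernetes.io/vsphere-volume
parameters:
  storagePolicyName: silver      # example policy name - must already exist in vCenter
  datastore: vsanDatastore

PVCs would then simply reference whichever class matches the capabilities they need.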
And just to show you that this storage policy already exists in my vCenter Server, here is a view of all of the existing storage policies via the vSphere client:
Now, with the StorageClass in place, there is no need to create individual PVs. All I need to do is create a PVC (persistent volume claim). So long as the PVC manifest YAML references the StorageClass, this will dynamically instantiate a persistent volume (and dynamically create a VMDK on my vSphere storage) with the attributes defined in the StorageClass. Let's take a look at that PVC YAML next.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-sc-pvc
spec:
  storageClassName: demo-sc-vsan
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
Nothing very new to report here, apart from the fact that it now references an actual StorageClass. Otherwise, it has much the same entries as the PVC YAML that we used in the first exercise. Let’s now create that PVC and see what happens.
$ kubectl get pv
No resources found.

$ kubectl get pvc
No resources found.

$ kubectl create -f demo-sc-pvc.yaml
persistentvolumeclaim/demo-sc-pvc created

$ kubectl get pvc
NAME          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
demo-sc-pvc   Pending                                      demo-sc-vsan   4s

$ kubectl get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
demo-sc-pvc   Bound    pvc-4f76bf98-82f1-11e9-b153-005056a29b20   2Gi        RWO            demo-sc-vsan   10s

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
pvc-4f76bf98-82f1-11e9-b153-005056a29b20   2Gi        RWO            Delete           Bound    demo/demo-sc-pvc   demo-sc-vsan            6s
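One thing worth noticing in the output above is the RECLAIM POLICY of Delete, inherited from the StorageClass: when the PVC is deleted, the dynamically provisioned VMDK is removed along with it. If you wanted the underlying volume to survive deletion of the claim, you could switch that particular PV to Retain; a kubectl patch along these lines should do it:

$ kubectl patch pv pvc-4f76bf98-82f1-11e9-b153-005056a29b20 \
    -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'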
By simply creating the PVC, we dynamically created a PV that is now available for any Pod that uses that PVC. Let's take a look at the YAML file for the Pod. Once again, it should look very similar to the previous example.
apiVersion: v1
kind: Pod
metadata:
  name: demo-sc-pod
spec:
  containers:
    - name: busybox
      image: "k8s.gcr.io/busybox"
      volumeMounts:
        - name: demo-vol
          mountPath: "/demo"
      command: [ "sleep", "1000000" ]
  volumes:
    - name: demo-vol
      persistentVolumeClaim:
        claimName: demo-sc-pvc
Let’s now go ahead and create our Pod to verify it can get access to this PV by using the PVC claim.
$ kubectl get pods
No resources found.

$ kubectl create -f demo-sc-pod.yaml
pod/demo-sc-pod created

$ kubectl get pod
NAME          READY   STATUS              RESTARTS   AGE
demo-sc-pod   0/1     ContainerCreating   0          4s

$ kubectl get pod
NAME          READY   STATUS    RESTARTS   AGE
demo-sc-pod   1/1     Running   0          23s
If we now describe the Pod, we can see details of any Volumes and Mounts, including the PersistentVolumeClaim used:
$ kubectl describe pod demo-sc-pod
Name:               demo-sc-pod
Namespace:          demo
Priority:           0
PriorityClassName:  <none>
Node:               19b9aeb4-78da-4631-8af9-38bc50d7fd29/10.27.51.191
Start Time:         Thu, 30 May 2019 16:44:45 +0100
Labels:             <none>
Annotations:        <none>
Status:             Running
IP:                 10.200.2.52
Containers:
  busybox:
    Container ID:  docker://d6ae672ab1104dd872f7546e3a74691b8735e77f5d66af31864a087368df5a76
    Image:         k8s.gcr.io/busybox
    Image ID:      docker-pullable://k8s.gcr.io/busybox@sha256:d8d3bc2c183ed2f9f10e7258f84971202325ee6011ba137112e01e30f206de67
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      1000000
    State:          Running
      Started:      Thu, 30 May 2019 16:45:04 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /demo from demo-vol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pv9p8 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  demo-vol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  demo-sc-pvc
    ReadOnly:   false
  default-token-pv9p8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pv9p8
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
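As an aside, if all you are after is the name of the claim backing the Pod's volume, you don't need to wade through the full describe output; a jsonpath query should pull it out directly:

$ kubectl get pod demo-sc-pod -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'
demo-sc-pvc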
And just like before, let's log into the container and verify that the volume has been mounted successfully. Looks good: the volume /demo has been mounted using /dev/sdd, a ~2GB volume.
$ kubectl exec -it demo-sc-pod /bin/sh
/ # df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                  49.1G      5.2G     41.4G  11% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                     7.8G         0      7.8G   0% /sys/fs/cgroup
/dev/sdd                  1.9G      3.0M      1.9G   0% /demo
/dev/sda1                 2.9G      1.6G      1.1G  60% /dev/termination-log
/dev/sdc1                49.1G      5.2G     41.4G  11% /etc/resolv.conf
/dev/sdc1                49.1G      5.2G     41.4G  11% /etc/hostname
/dev/sda1                 2.9G      1.6G      1.1G  60% /etc/hosts
shm                      64.0M         0     64.0M   0% /dev/shm
tmpfs                     7.8G     12.0K      7.8G   0% /tmp/secrets/kubernetes.io/serviceaccount
tmpfs                     7.8G         0      7.8G   0% /proc/acpi
tmpfs                    64.0M         0     64.0M   0% /proc/kcore
tmpfs                    64.0M         0     64.0M   0% /proc/keys
tmpfs                    64.0M         0     64.0M   0% /proc/timer_list
tmpfs                    64.0M         0     64.0M   0% /proc/sched_debug
tmpfs                     7.8G         0      7.8G   0% /proc/scsi
tmpfs                     7.8G         0      7.8G   0% /sys/firmware
/ #
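If you want to go one step further and confirm that the volume is actually writable, a quick test from within the same shell might look like this (the file name is arbitrary):

/ # echo "hello from demo-sc-pod" > /demo/hello.txt
/ # cat /demo/hello.txt
hello from demo-sc-pod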
One thing that might be worth pointing out is what has happened at the vSphere infrastructure level to make all of this work. Kubernetes runs on vSphere as a set of virtual machines: one or more virtual machines take the role of Kubernetes masters, and one or more take the role of worker nodes. Pods are typically deployed on the worker nodes. When this Pod was created, a master scheduled it onto one of the workers. The VMDK was then dynamically created on my vSphere storage, attached to that worker node VM, and made available to the Pod as a PV. If you examine the worker node on which the Pod has been scheduled (the node is shown in the describe output of the Pod – see above), you can see that the PV attached and mounted to the Pod has indeed been attached to that worker node VM as a VMDK.
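Incidentally, if you just want to see which worker node a Pod landed on without reading the full describe output, the wide output of kubectl get pod includes NODE and IP columns. For this Pod, it would look something like the following (output trimmed for readability; the node name and IP are the ones from my environment shown above):

$ kubectl get pod demo-sc-pod -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP            NODE
demo-sc-pod   1/1     Running   0          23s   10.200.2.52   19b9aeb4-78da-4631-8af9-38bc50d7fd29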
What if we want to deploy another Pod which needs storage with similar attributes? Here you will need to create another PVC, because two Pods cannot share this PV: its accessModes is set to ReadWriteOnce in the PVC manifest YAML file (a sketch of a second claim follows the error output below). If you try to launch another Pod with the same PVC, you will get an error similar to the following in the Pod event logs:
Events:
  Type     Reason              Age   From                     Message
  ----     ------              ----  ----                     -------
  Normal   Scheduled           10s   default-scheduler        Successfully assigned demo/demo-sc-pod-b to 5a3f930d-9da2-4648-b262-8b5b739abf89
  Warning  FailedAttachVolume  10s   attachdetach-controller  Multi-Attach error for volume "pvc-4f76bf98-82f1-11e9-b153-005056a29b20" Volume is already used by pod(s) demo-sc-pod
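The fix, as mentioned above, is to give the second Pod its own claim against the same StorageClass, which will dynamically provision a second VMDK/PV with the same attributes. A sketch of such a claim might look like this (the name demo-sc-pvc-b is just an example):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-sc-pvc-b          # example name for the second claim
spec:
  storageClassName: demo-sc-vsan
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi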
Great article. Any idea how to get the controller manager to trust my vCenter server's certificate? I am seeing this in the logs.
nodemanager.go:392] Cannot connect to vCenter with err: Post https://vcenterserver.com:443/sdk: x509: certificate signed by unknown authority
I’ve not seen this with vCenter, Aaron, but I have seen x509 issues when integrating my K8s deployment with Harbor.
The issue arose when my K8s nodes tried to pull an image from the Harbor repo. This failed with x509 errors. To rectify that issue, I had to place the CA cert pulled down from Harbor into /etc/docker/certs.d/harbor.rainpole.com/ on each of my nodes. Perhaps you need something similar, but for VC?