Kubernetes Storage on vSphere 101 – StorageClass

In the first 101 post, we talked about persistent volumes (PVs), persistent volume claims (PVCs) and Pods (a group of one or more containers). In particular, we saw how, with Kubernetes on vSphere, a persistent volume is essentially a VMDK (virtual machine disk) on a datastore. In that first post, we created a static VMDK on a vSAN datastore, then built manifest files (in our case YAML) for a PV, a PVC and finally a Pod, and showed how to map that static, preexisting VMDK directly to the Pod so that it could be mounted. We saw the VMDK/PV get attached, and we also saw it mounted within the container itself.

Now that was quite a tedious process. You really do not want to be in the business of creating individual VMDKs in vSphere, then creating multiple PV and PVC YAML manifest files to map each VMDK as a PV to the Pods/containers. What we want is something much more dynamic; in other words, we simply call out that the Pod or Pods require a volume of a specific size and a specific access mode, and have the PV dynamically provisioned. Fortunately, there is already a way of doing this in Kubernetes. This is where StorageClass comes in. Now, it is important to understand that while a StorageClass allows for the dynamic provisioning of PVs, it is also used to define different classes of storage. For example, if we think about vSAN, VMware’s hyperconverged storage offering that provides different storage capabilities (RAID-1, RAID-5, RAID-6, stripe width, etc.), one could create a number of different StorageClasses representing each of the different capabilities. As PVs are provisioned against different StorageClasses, they get instantiated as VMDKs with different capabilities on the underlying vSAN datastore. In this post, let’s take a look at how we would use a StorageClass to dynamically provision storage with different capabilities.

To begin with, here is my very simple StorageClass definition. The only things of significant interest in this manifest YAML file are the parameters, which hold the datastore and storagePolicyName. The storagePolicyName relates to a vSAN storage policy, implying that any PVs, instantiated as VMDKs on my vSAN datastore, will have the set of attributes defined in the policy “gold”. This policy has to be pre-created in the vSphere environment where you plan to create your VMDK/PV before it can be used.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: demo-sc-vsan
provisioner: kubernetes.io/vsphere-volume
parameters:
  storagePolicyName: gold
  datastore: vsanDatastore
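
As a quick illustration of the “different classes of storage” idea mentioned above, a second StorageClass pointing at a different policy would look almost identical. Note that the policy name silver and the datastore name below are hypothetical placeholders, not something created in this demo; any pre-created SPBM policy and any accessible datastore could be substituted.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: demo-sc-silver
provisioner: kubernetes.io/vsphere-volume
parameters:
  storagePolicyName: silver       # hypothetical, pre-created SPBM policy
  datastore: vmfsDatastore        # hypothetical datastore name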

While vSAN has specific storage policy attributes, the tag feature of storage policy based management could be used with any vSphere datastore. At the moment, there are no other StorageClasses or PVs in the environment, so let’s go ahead and build this StorageClass.

$ kubectl get sc
No resources found.

$ kubectl get pv
No resources found.

$ kubectl create -f demo-sc.yaml
storageclass.storage.k8s.io/demo-sc-vsan created

$ kubectl get sc
NAME           PROVISIONER                    AGE
demo-sc-vsan   kubernetes.io/vsphere-volume   5s

$ kubectl describe sc demo-sc-vsan
Name:                  demo-sc-vsan
IsDefaultClass:        No
Annotations:           <none>
Provisioner:           kubernetes.io/vsphere-volume
Parameters:            datastore=vsanDatastore,storagePolicyName=gold
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>

And just to show that this storage policy already exists in my vCenter server, the full list of existing storage policies can be viewed via the vSphere client.

Now, with the StorageClass in place, there is no need to create individual PVs. All I need to do is create a PVC (persistent volume claim). As long as the PVC manifest YAML references the StorageClass, this should dynamically instantiate a persistent volume (dynamically creating a VMDK on my vSphere storage) with the attributes defined in the StorageClass. Let’s take a look at that PVC YAML next.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-sc-pvc
spec:
  storageClassName: demo-sc-vsan
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

Nothing very new to report here, apart from the fact that it now references an actual StorageClass. Otherwise, it has much the same entries as the PVC YAML that we used in the first exercise. Let’s now create that PVC and see what happens.

$ kubectl get pv
No resources found.

$ kubectl get pvc
No resources found.

$ kubectl create -f demo-sc-pvc.yaml
persistentvolumeclaim/demo-sc-pvc created

$ kubectl get pvc
NAME          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
demo-sc-pvc   Pending                                      demo-sc-vsan   4s

$ kubectl get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
demo-sc-pvc   Bound    pvc-4f76bf98-82f1-11e9-b153-005056a29b20   2Gi        RWO            demo-sc-vsan   10s

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
pvc-4f76bf98-82f1-11e9-b153-005056a29b20   2Gi        RWO            Delete           Bound    demo/demo-sc-pvc   demo-sc-vsan            6s

By simply creating the PVC, we dynamically created a PV that is now available for any Pod that uses the PVC. Let’s take a look at the YAML file for the Pod. Once again, it should look very similar to the previous example.

apiVersion: v1
kind: Pod
metadata:
  name: demo-sc-pod
spec:
  containers:
  - name: busybox
    image: "k8s.gcr.io/busybox"
    volumeMounts:
    - name: demo-vol
      mountPath: "/demo"
    command: [ "sleep", "1000000" ]
  volumes:
    - name: demo-vol
      persistentVolumeClaim:
        claimName: demo-sc-pvc

Let’s now go ahead and create our Pod to verify it can get access to this PV by using the PVC claim.

$ kubectl get pods
No resources found.

$ kubectl create -f demo-sc-pod.yaml
pod/demo-sc-pod created

$ kubectl get pod
NAME          READY   STATUS              RESTARTS   AGE
demo-sc-pod   0/1     ContainerCreating   0          4s

$ kubectl get pod
NAME          READY   STATUS    RESTARTS   AGE
demo-sc-pod   1/1     Running   0          23s

If we now describe the Pod, we can see details of any Volumes and Mounts, including the PersistentVolumeClaim used:

$ kubectl describe pod demo-sc-pod
Name:               demo-sc-pod
Namespace:          demo
Priority:           0
PriorityClassName:  <none>
Node:               19b9aeb4-78da-4631-8af9-38bc50d7fd29/10.27.51.191
Start Time:         Thu, 30 May 2019 16:44:45 +0100
Labels:             <none>
Annotations:        <none>
Status:             Running
IP:                 10.200.2.52
Containers:
  busybox:
    Container ID:  docker://d6ae672ab1104dd872f7546e3a74691b8735e77f5d66af31864a087368df5a76
    Image:         k8s.gcr.io/busybox
    Image ID:      docker-pullable://k8s.gcr.io/busybox@sha256:d8d3bc2c183ed2f9f10e7258f84971202325ee6011ba137112e01e30f206de67
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      1000000
    State:          Running
      Started:      Thu, 30 May 2019 16:45:04 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /demo from demo-vol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pv9p8 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  demo-vol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  demo-sc-pvc
    ReadOnly:   false
  default-token-pv9p8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pv9p8
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

And just like before, let’s log into the container and verify that the volume has been mounted successfully. As the output below shows, the volume /demo has been mounted from /dev/sdd, a ~2GB device. LGTM.

$ kubectl exec -it demo-sc-pod /bin/sh

/ # df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                  49.1G      5.2G     41.4G  11% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                     7.8G         0      7.8G   0% /sys/fs/cgroup
/dev/sdd                  1.9G      3.0M      1.9G   0% /demo
/dev/sda1                 2.9G      1.6G      1.1G  60% /dev/termination-log
/dev/sdc1                49.1G      5.2G     41.4G  11% /etc/resolv.conf
/dev/sdc1                49.1G      5.2G     41.4G  11% /etc/hostname
/dev/sda1                 2.9G      1.6G      1.1G  60% /etc/hosts
shm                      64.0M         0     64.0M   0% /dev/shm
tmpfs                     7.8G     12.0K      7.8G   0% /tmp/secrets/kubernetes.io/serviceaccount
tmpfs                     7.8G         0      7.8G   0% /proc/acpi
tmpfs                    64.0M         0     64.0M   0% /proc/kcore
tmpfs                    64.0M         0     64.0M   0% /proc/keys
tmpfs                    64.0M         0     64.0M   0% /proc/timer_list
tmpfs                    64.0M         0     64.0M   0% /proc/sched_debug
tmpfs                     7.8G         0      7.8G   0% /proc/scsi
tmpfs                     7.8G         0      7.8G   0% /sys/firmware
/ #
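
As an additional, optional sanity check (not part of the original demo output), you could write a file to the volume from within the same container session to confirm it is writable:

/ # echo "hello from demo-sc-pod" > /demo/demo.txt
/ # cat /demo/demo.txt
hello from demo-sc-pod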

One thing that might be worth pointing out is what has happened at the vSphere infrastructure level to make this happen. Kubernetes runs on vSphere as a set of virtual machines. One or more virtual machines have the role of Kubernetes masters, and one or more virtual machines have the role of worker nodes. Pods are typically deployed on the worker nodes. When this Pod was created, a master would have scheduled it to run on one of the workers. The VMDK would then have been dynamically created on my vSphere storage, attached to that worker node VM, and made available to the Pod as a PV. If you examine the worker node on which the Pod has been scheduled (the node is shown in the describe output of the Pod – see above), you can see that the PV attached and mounted to the Pod has indeed been attached to that worker node VM as a VMDK.
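
If you want to trace this from the Kubernetes side, the following commands (a sketch, reusing the PV name from this demo) show which worker node the Pod landed on, and the PV’s Source section shows the datastore and VMDK path backing the volume:

$ kubectl get pod demo-sc-pod -o wide
$ kubectl describe pv pvc-4f76bf98-82f1-11e9-b153-005056a29b20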

What if we want to deploy another Pod which has storage with similar attributes? Here you will need to create another PVC, since the volume accessModes is set to ReadWriteOnce in the PVC manifest YAML file, meaning the volume can only be attached to one worker node at a time and so cannot be shared between Pods in this way. If you try to launch another Pod with the same PVC, you will get an error similar to the following in the Pod event logs:

Events:
  Type     Reason              Age   From                     Message
  ----     ------              ----  ----                     -------
  Normal   Scheduled           10s   default-scheduler        Successfully assigned demo/demo-sc-pod-b to 5a3f930d-9da2-4648-b262-8b5b739abf89
  Warning  FailedAttachVolume  10s   attachdetach-controller  Multi-Attach error for volume "pvc-4f76bf98-82f1-11e9-b153-005056a29b20" Volume is already used by pod(s) demo-sc-pod

If you wanted to have multiple Pods share the same PV, you would need to use a ReadWriteMany volume, and this, from what I am hearing in the wider community, would typically be an NFS share.

However, you can create multiple PVCs against the same StorageClass, and you can create multiple StorageClasses with different datastores and/or different policies for vSphere storage, instantiating PVs/VMDKs with many different attributes, all dynamically provisioned through the use of StorageClasses and PVCs.
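
A second PVC against the same StorageClass is simply another claim manifest; the name demo-sc-pvc-2 below is purely illustrative and not part of the original demo:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-sc-pvc-2
spec:
  storageClassName: demo-sc-vsan
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi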

OK – so now we have seen how to create a statically provisioned PV in the previous post, and a dynamically provisioned PV in this post. You must now be asking how this works for large-scale applications that require persistent storage. For example, if I had some NoSQL database application deployed in Kubernetes, do I really need to create a new PVC every time I want to add a new Pod with its own PV to this application? The answer is no. Kubernetes has other objects which will take care of this for you, such as Deployments and StatefulSets. That will be the subject of our next topic.
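
As a small preview of that next topic, here is a minimal sketch only, with hypothetical names (it assumes a matching headless Service called demo-sc-svc exists): a StatefulSet can carry a volumeClaimTemplates section that references our StorageClass, and Kubernetes then creates a PVC/PV pair per replica automatically.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: demo-sc-sts
spec:
  serviceName: demo-sc-svc
  replicas: 3
  selector:
    matchLabels:
      app: demo-sc
  template:
    metadata:
      labels:
        app: demo-sc
    spec:
      containers:
      - name: busybox
        image: "k8s.gcr.io/busybox"
        command: [ "sleep", "1000000" ]
        volumeMounts:
        - name: demo-vol
          mountPath: "/demo"
  volumeClaimTemplates:
  - metadata:
      name: demo-vol
    spec:
      storageClassName: demo-sc-vsan
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi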

The manifests used in this demo can be found on my vsphere-storage-101 GitHub repo.
