Step 1 – Set up the Key Provider
This is a prerequisite. You cannot do encryption on vSphere unless you have a Key Provider configured. I am not going to go through the steps here, but if you search this site, you will find examples of how to set up a HyTrust KMIP and establish trust between vCenter Server and the HyTrust KMIP. Once this has been successfully configured, you should see something similar to the following in the vCenter > Configure > Security > Key Providers view:
Step 2 – Create an Encryption Policy
To provision a persistent volume with encryption, a storage policy with encryption capabilities needs to be created. Since VM encryption is a “Host based service”, you must select “Enable host based rules” to configure it. If you wanted the encrypted Persistent Volume to be placed on particular storage, you could also include datastore specific rules. Below you can see rules for vSAN storage and for a Pure Storage vVol capable FlashArray, and if you wanted the volume to be placed on either VMFS or NFS, you could use tag based placement rules. I’m not going to select any datastore in this example, as I don’t really care about which storage the Persistent Volume is placed on, so long as it is encrypted. In production, you will probably combine this host based rule for encryption with a datastore specific rule.
On the following screen, select “Use storage policy component” and set it to “Default encryption policies”. This will automatically pick up the Key Provider already configured and trusted for this vCenter Server. The custom option can be used to decide when IO Filters are applied in the IO path, either before or after encryption. Some products and features that sit in the IO path are implemented as IO Filters. This is not applicable in my setup.
After the storage policy has been created, it can be reviewed.
Step 3 – Test policy with a full VM deployment
I normally ensure that a fully deployed virtual machine works with encryption before moving on to testing with individual Kubernetes Persistent Volumes. After deploying a full virtual machine with the encryption policy, edit the settings of the virtual machine in the vSphere UI, and you should see encryption highlighted for both the VM configuration files and the VMDKs (virtual machine disks). You can now be confident that the encryption policy is working, and move on to testing encryption of Kubernetes Persistent Volumes with the new CSI driver in vSphere 7.0.
Step 4 – Create StorageClass, PersistentVolumeClaim and Pod
This test involves creating a StorageClass that specifies the new vSphere CSI driver as the provisioner, as well as a storage policy that uses encryption. It also involves creating a simple PVC to dynamically provision the Persistent Volume (PV). Finally, we will create a Pod and ensure that the encrypted volume can be successfully attached to, and mounted in, the Pod. Here are the manifest files that I plan to use for this test. This is the StorageClass manifest referencing the encryption storage policy.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: encrypt-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "Simple-Encryption"
This is the PersistentVolumeClaim (PVC) manifest, referencing the StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: encrypt-pvc
spec:
  storageClassName: encrypt-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2G
This is the Pod manifest (a simple busybox), referencing the PVC.
apiVersion: v1
kind: Pod
metadata:
  name: encrypt-pod-a
spec:
  containers:
    - name: encrypt-pod-a
      image: "k8s.gcr.io/busybox"
      volumeMounts:
        - name: encrypt-vol
          mountPath: "/mnt/volume1"
      command: [ "sleep", "1000000" ]
  volumes:
    - name: encrypt-vol
      persistentVolumeClaim:
        claimName: encrypt-pvc
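For reference, assuming the manifests above are saved as encrypt-sc.yaml, encrypt-pvc.yaml and encrypt-pod.yaml (the first two filenames are just my own naming), deploying them is simply a matter of applying each one in turn:

$ kubectl apply -f encrypt-sc.yaml    # StorageClass referencing the encryption storage policy
storageclass.storage.k8s.io/encrypt-sc created

$ kubectl apply -f encrypt-pvc.yaml   # PVC that dynamically provisions the encrypted PV
persistentvolumeclaim/encrypt-pvc created

$ kubectl apply -f encrypt-pod.yaml   # busybox Pod that mounts the PVC
pod/encrypt-pod-a created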
In my first test run, the creation of the StorageClass and PVC both went fine. However, when I tried to deploy the Pod, the following errors were thrown when I did a describe of the Pod:
Events:
  Type     Reason              Age                            From                     Message
  ----     ------              ----                           ----                     -------
  Normal   Scheduled           <invalid>                      default-scheduler        Successfully assigned default/encrypt-pod-a to k8s-worker07-01
  Warning  FailedAttachVolume  <invalid> (x3 over <invalid>)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-77d3bf1c-1ef4-40a3-8608-dfc6a9a3ea57" : rpc error: code = Internal desc = Failed to attach disk: "449cc99f-6cda-48e0-84f7-b7b150d4ab0a" with node: "k8s-worker07-01" err failed to attach cns volume: "449cc99f-6cda-48e0-84f7-b7b150d4ab0a" to node vm: "VirtualMachine:vm-1081 [VirtualCenterHost: 10.27.51.106, UUID: 42051f46-6ac5-3b3a-502b-8242b0325b9d, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: 10.27.51.106]]". fault: "(*types.LocalizedMethodFault)(0xc00000cde0)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault) <nil>,\n Reason: (string) (len=92) \"CNS: Failed to attach disk when calling AttachDisk:Fault cause: vim.fault.InvalidDeviceSpec\\n\"\n },\n LocalizedMessage: (string) (len=108) \"CnsFault error: CNS: Failed to attach disk when calling AttachDisk:Fault cause: vim.fault.InvalidDeviceSpec\\n\"\n})\n". opId: "0eac10c4"
OK – it is not very obvious from that output what the problem is. Fortunately, the message displayed in the vSphere UI soon afterwards is a lot clearer about the actual issue: the PV cannot be attached to a virtual machine which is not itself encrypted. And this is true – I did not encrypt my Kubernetes worker nodes.
Step 5 – Encrypt VM, add as K8s nodes and re-test
To complete my test, I am going to add another worker node VM to my Kubernetes cluster which is fully encrypted. In fact, I am going to use the VM that I tested for encryption in step 3. I will then cordon/drain my other 2 Kubernetes worker nodes so that they are no longer used for scheduling Pods. This means that when I create my busybox Pod with the encrypted PVC, it should get scheduled on the new ‘encrypted’ Kubernetes worker node, as it is the only node available to schedule Pods. There are other ways to achieve this of course, such as labeling the encrypted K8s node and using a nodeSelector in the Pod to match the node label (a quick sketch of that approach follows the node listing below). But for the purposes of this test, I will use drain to disable scheduling on the other nodes. In the output below, you can observe the new VM joining the Kubernetes cluster as a worker node.
$ kubectl get nodes
NAME              STATUS   ROLES    AGE    VERSION
k8s-master07-01   Ready    master   2d1h   v1.16.3
k8s-worker07-01   Ready    <none>   47h    v1.16.3
k8s-worker07-02   Ready    <none>   47h    v1.16.3

$ kubectl get nodes
NAME               STATUS     ROLES    AGE    VERSION
encrypted-ubuntu   NotReady   <none>   32s    v1.16.3
k8s-master07-01    Ready      master   2d1h   v1.16.3
k8s-worker07-01    Ready      <none>   47h    v1.16.3
k8s-worker07-02    Ready      <none>   47h    v1.16.3

$ kubectl get nodes
NAME               STATUS   ROLES    AGE    VERSION
encrypted-ubuntu   Ready    <none>   53s    v1.16.3
k8s-master07-01    Ready    master   2d1h   v1.16.3
k8s-worker07-01    Ready    <none>   47h    v1.16.3
k8s-worker07-02    Ready    <none>   47h    v1.16.3
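As an aside, here is a rough sketch of the label/nodeSelector approach mentioned above, in case you prefer it to draining nodes. The label key and value are arbitrary names of my own choosing:

$ kubectl label node encrypted-ubuntu disktype=encrypted
node/encrypted-ubuntu labeled

The Pod manifest would then include a matching nodeSelector in its spec, for example:

spec:
  nodeSelector:
    disktype: encrypted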
The next step is to drain the two unencrypted workers so that only the encrypted worker node is available for scheduling. Don’t worry about the errors from the drain command; the workers continue to run the Kubernetes system Pods, but no new Pods will be scheduled on them.
$ kubectl drain k8s-worker07-01
node/k8s-worker07-01 cordoned
error: unable to drain node "k8s-worker07-01", aborting command...

There are pending nodes to be drained:
 k8s-worker07-01
error: the server could not find the requested resource: kube-flannel-ds-amd64-rvzrw, kube-proxy-cdns5, vsphere-csi-node-jsv84

$ kubectl drain k8s-worker07-02
node/k8s-worker07-02 cordoned
error: unable to drain node "k8s-worker07-02", aborting command...

There are pending nodes to be drained:
 k8s-worker07-02
error: the server could not find the requested resource: kube-flannel-ds-amd64-nswzk, kube-proxy-mkq6d, vsphere-csi-node-9tcd6

$ kubectl get nodes
NAME               STATUS                     ROLES    AGE    VERSION
encrypted-ubuntu   Ready                      <none>   81s    v1.16.3
k8s-master07-01    Ready                      master   2d1h   v1.16.3
k8s-worker07-01    Ready,SchedulingDisabled   <none>   47h    v1.16.3
k8s-worker07-02    Ready,SchedulingDisabled   <none>   47h    v1.16.3
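Incidentally, the Pods that the drain command complains about (flannel, kube-proxy and the vSphere CSI node Pods) are all managed by DaemonSets, and kubectl drain will not evict DaemonSet managed Pods by default. The cordon has already taken effect, which is all that matters here, but if you want a cleaner drain you could try telling it to skip those Pods:

$ kubectl drain k8s-worker07-01 --ignore-daemonsets
$ kubectl drain k8s-worker07-02 --ignore-daemonsets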
Now we can go ahead and deploy the busybox Pod once more and see if the encrypted PV will attach and mount. Since the encrypted K8s worker node is the only one available, the Pod should get scheduled on that node, and thus the PV should also get attached to that node, since it is also encrypted. We can verify that both the PVC and PV are still present from the previous test, which means we only need to deploy the Pod.
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
pvc-77d3bf1c-1ef4-40a3-8608-dfc6a9a3ea57   2Gi        RWO            Delete           Bound    default/encrypt-pvc   encrypt-sc              18m

$ kubectl get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
encrypt-pvc   Bound    pvc-77d3bf1c-1ef4-40a3-8608-dfc6a9a3ea57   2Gi        RWO            encrypt-sc     18m

$ kubectl apply -f encrypt-pod.yaml
pod/encrypt-pod-a created

$ kubectl get pods
NAME            READY   STATUS              RESTARTS   AGE
encrypt-pod-a   0/1     ContainerCreating   0          6s

$ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
encrypt-pod-a   1/1     Running   0          18s
And now it appears that the Pod with the encrypted PV has started successfully. As a final check, we can look at the Pod’s events. This time things look much better.
Events:
  Type    Reason                  Age        From                       Message
  ----    ------                  ----       ----                       -------
  Normal  Scheduled               <invalid>  default-scheduler          Successfully assigned default/encrypt-pod-a to encrypted-ubuntu
  Normal  SuccessfulAttachVolume  <invalid>  attachdetach-controller    AttachVolume.Attach succeeded for volume "pvc-77d3bf1c-1ef4-40a3-8608-dfc6a9a3ea57"
  Normal  Pulling                 <invalid>  kubelet, encrypted-ubuntu  Pulling image "k8s.gcr.io/busybox"
  Normal  Pulled                  <invalid>  kubelet, encrypted-ubuntu  Successfully pulled image "k8s.gcr.io/busybox"
  Normal  Created                 <invalid>  kubelet, encrypted-ubuntu  Created container encrypt-pod-a
  Normal  Started                 <invalid>  kubelet, encrypted-ubuntu  Started container encrypt-pod-a
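If you want to go one step further and prove that the encrypted volume is actually usable from inside the Pod, you could write and read back a file on the mount point, something along these lines (the file name and contents are arbitrary):

$ kubectl exec -it encrypt-pod-a -- sh -c 'echo "hello from an encrypted PV" > /mnt/volume1/demo.txt'
$ kubectl exec -it encrypt-pod-a -- cat /mnt/volume1/demo.txt
hello from an encrypted PV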
That concludes the post. This is another really cool new feature of the CSI driver in vSphere 7.0. If the contents of the Persistent Volumes that you are attaching to your Kubernetes Pods need to be encrypted, this is something that you can now leverage when you run your Kubernetes clusters on top of vSphere 7.0 storage. Note however that the Kubernetes worker nodes must also be encrypted for this to work.