Introducing vSphere Cloud Native Storage (CNS)

I’m delighted to be able to share with you that, coinciding with the release of vSphere 6.7 U3, VMware have also announced Cloud Native Storage (CNS). CNS builds on the legacy of the earlier vSphere Cloud Provider (VCP) for Kubernetes, and along with a new release of the Container Storage Interface (CSI) for vSphere and Cloud Provider Interface (CPI) for vSphere, CNS aims to improve container volume management and provide deep insight into how container applications running on top of vSphere infrastructure are consuming the underlying vSphere Storage. Now, there may be a lot of unfamiliar terminology in that opening paragraph, so let’s try to give a brief introduction about where Kubernetes and vSphere integration started, and how it got to where it is today.

VMware’s first container storage project – Hatchway

Project Hatchway offered container environments a way to consume vSphere storage infrastructure, from hyper-converged infrastructure (HCI) powered by VMware vSAN to traditional SAN (VMFS) and NAS (NFS) storage. There were two distinct parts to the project initially: one focusing on Docker container volumes and the other focusing on Kubernetes volumes. Both aimed to dynamically provision VMDKs on vSphere storage to provide a persistent storage solution for applications running in a container orchestrator on vSphere. The Kubernetes part, the VCP, is still found in many K8s distributions running on vSphere, but having a volume driver embedded in core Kubernetes code became problematic for a number of reasons, and led to the creation of an out-of-tree driver specification called CSI, the Container Storage Interface. The VCP will eventually be phased out of Kubernetes in favour of the out-of-tree vSphere CSI driver.

Introducing CSI, the Container Storage Interface

Kubernetes, in its early days, introduced the concept of volumes to provide persistent storage to containerized applications. When a Pod was provisioned, a Persistent Volume (PV) could be connected to the Pod. The internal “provider” code would provision the necessary storage on the underlying infrastructure, e.g. the VCP on vSphere. However, as K8s popularity rose, more and more “providers” were added in-tree. This became unwieldy, and because the providers shipped as part of Kubernetes itself, provider enhancements could not be introduced outside of K8s releases. This led to the creation of the Container Storage Interface (CSI), effectively an API between container orchestrators and storage providers to allow consistent interoperability. Eventually all in-tree storage “providers” will be removed from Kubernetes and replaced with an out-of-tree “plugin” format. CSI covers all the volume life-cycle tasks (i.e. create, attach, detach, delete, mount, unmount). VMware has rewritten the VCP to adhere to the CSI specification, and when coupled with the VMware CNS feature (which we will see shortly), this provides huge advantages when running and monitoring Kubernetes applications consuming persistent storage on vSphere infrastructure.
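
The ordering of those life-cycle tasks can be sketched as a small state machine. This is only an illustration in Python: the state names and the `apply_operations` helper are hypothetical, and the real CSI specification defines a set of gRPC calls rather than this simplified model.

```python
# Simplified, hypothetical model of a volume's life-cycle.
# Keys are (current state, operation); values are the resulting state.
TRANSITIONS = {
    ("absent", "create"): "created",
    ("created", "attach"): "attached",
    ("attached", "mount"): "mounted",
    ("mounted", "unmount"): "attached",
    ("attached", "detach"): "created",
    ("created", "delete"): "absent",
}


def apply_operations(operations, state="absent"):
    """Apply life-cycle operations in order and return the final state.

    Raises ValueError if an operation is not valid in the current state,
    e.g. trying to mount a volume that has not been attached to a node.
    """
    for op in operations:
        key = (state, op)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {op} a volume in state {state!r}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    # A volume that is created, attached to a node and mounted into a Pod.
    print(apply_operations(["create", "attach", "mount"]))
```

The point of the sketch is that a volume has to be unmounted and detached before it can be deleted, which is the kind of ordering a CSI driver and the orchestrator have to agree on.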

Introducing CPI, the Cloud Provider Interface

One should note that the CSI does not completely replace the functionality provided by the VCP. Thus, another interface has been developed called the Cloud Provider Interface (CPI). This has also been referred to as the Cloud Controller Manager (CCM) in the past. As you can imagine, there are numerous ‘cloud’ providers offering Kubernetes, both in public clouds and on-premises in private clouds. Since the underlying infrastructure of each cloud is different, it was decided that some of the tasks (control loops) that were previously handled by the core Kubernetes controller should also be moved out of core Kubernetes and into a CPI plugin format, as many of these control loops were cloud provider specific. For the vSphere CPI, it is the node control loops that are of interest. These do a number of tasks, such as initializing a node with cloud specific zone/region labels and other cloud specific instance details such as type and size. They do a number of other tasks as well, but CSI in conjunction with CPI allows us to do intelligent placement of Pods and PVs on vSphere infrastructure (across Datacenters, Clusters, Hosts, etc.).
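
To make the zone/region labelling a little more concrete, here is a minimal Python sketch that groups nodes by their topology labels. It assumes the `failure-domain.beta.kubernetes.io/region` and `failure-domain.beta.kubernetes.io/zone` label keys that Kubernetes uses for this purpose at the time of writing; the node names and label values are made up, and this is not the CPI code itself.

```python
from collections import defaultdict

# Well-known Kubernetes topology label keys (assumed to be the ones in use).
REGION_LABEL = "failure-domain.beta.kubernetes.io/region"
ZONE_LABEL = "failure-domain.beta.kubernetes.io/zone"


def group_nodes_by_topology(nodes):
    """Return {(region, zone): [node names]} based on each node's labels.

    Nodes that have not been labelled yet end up under (None, None).
    """
    groups = defaultdict(list)
    for node in nodes:
        labels = node.get("labels", {})
        key = (labels.get(REGION_LABEL), labels.get(ZONE_LABEL))
        groups[key].append(node["name"])
    return dict(groups)


# Hypothetical nodes, e.g. one vSphere Datacenter per region and one Cluster per zone.
EXAMPLE_NODES = [
    {"name": "k8s-worker-1", "labels": {REGION_LABEL: "dc-east", ZONE_LABEL: "cluster-a"}},
    {"name": "k8s-worker-2", "labels": {REGION_LABEL: "dc-east", ZONE_LABEL: "cluster-b"}},
    {"name": "k8s-worker-3", "labels": {REGION_LABEL: "dc-east", ZONE_LABEL: "cluster-a"}},
    {"name": "k8s-worker-4", "labels": {}},
]

if __name__ == "__main__":
    for topology, names in group_nodes_by_topology(EXAMPLE_NODES).items():
        print(topology, names)
```

With labels like these in place, a scheduler or storage driver can keep a Pod and its PV in the same zone rather than placing them on infrastructure that cannot reach the same datastore.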

VMware Cloud Native Storage (CNS)

Now that we have discussed CSI and CPI, we come to the VMware Cloud Native Storage initiative. VMware CNS is focused on managing and monitoring container volume life-cycle operations on vSphere storage. In a nutshell, Cloud Native Storage is a control plane that manages the life-cycle of container volumes on vSphere, independent of virtual machine and container Pod life-cycles. CNS enables health monitoring, policy compliance and visibility of a volume’s metadata (name, cluster name, labels, etc.) through the vSphere UI. It gives a vSphere administrator key insights into vSphere storage resource consumption by containerized applications running on Kubernetes on vSphere infrastructure.

CNS Key Components

CNS is already fully integrated into vSphere; there is nothing additional to install. Its purpose is to co-ordinate persistent volume operations (create, attach, detach, delete, mount, unmount) on Kubernetes nodes (which are virtual machines on vSphere), as requested by the CSI driver. Persistent Volumes are instantiated on vSphere storage as First Class Disks, appearing as standard VMDKs on the vSphere storage. What this means is that certain operations, such as snapshot create/delete, backup and restore, can now be done on a per Kubernetes persistent volume basis, allowing Kubernetes applications running on vSphere to leverage traditional protection methods that have been available to virtual machines for many years. This is what we mean when we say that the PV life-cycle is outside of the VM or Pod life-cycle; in the past, to snapshot a VMDK, one would have to snapshot the whole of the VM. If this VM is a Kubernetes node, it could be running multiple container-based applications as well as having PVs/VMDKs from many applications attached. CNS now facilitates life-cycle operations at the container/Persistent Volume granularity.

From a visibility perspective, Kubernetes volume metadata is stored in a local CacheDB on the vCenter server for swift rendering in the vSphere UI. Kubernetes StorageClasses can reference vSphere storage policies (SPBM), allowing attributes of the underlying vSphere storage to be assigned to the volumes, and allowing the volumes to be continuously checked for policy compliance. Should the underlying vSphere infrastructure use VMware vSAN, CNS is also integrated into parts of the vSAN UI.
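
As a rough illustration of how a StorageClass points at a vSphere storage policy, the Python sketch below builds a StorageClass manifest and prints it as JSON (which `kubectl` accepts as well as YAML). It assumes the vSphere CSI driver's provisioner name `csi.vsphere.vmware.com` and its `storagepolicyname` parameter; the class name `gold-sc` and the policy name `Gold` are hypothetical, so substitute a policy that exists in your own vCenter.

```python
import json


def build_storage_class(name, policy_name):
    """Build a StorageClass manifest that references an SPBM policy by name.

    Assumption: the vSphere CSI driver is installed and registered under
    the provisioner name used below.
    """
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "csi.vsphere.vmware.com",
        "parameters": {"storagepolicyname": policy_name},
    }


if __name__ == "__main__":
    # "gold-sc" and "Gold" are placeholder names for illustration only.
    print(json.dumps(build_storage_class("gold-sc", "Gold"), indent=2))
```

Any PersistentVolumeClaim that names this StorageClass would then have its volume provisioned on storage that satisfies the policy, and it is that policy which the compliance check is made against.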

A simplified diagram of the key components of CNS is shown here.

It should be noted that in this first version of CNS, we only offer visibility of block volumes (PVs/VMDKs), but those block volumes can be provisioned on VMFS, NFS or vSAN datastores. Visibility of File Shares is something that we are looking at for a future release.

Kubernetes Volumes visible in the vSphere UI

By way of showing you what sort of information CNS can make visible in the vSphere UI, I borrowed a couple of screenshots from our technical marketing team (thanks guys). Below we have selected a cluster object in the vSphere inventory. Judging by the names of the virtual machines, we can assume that there is a Kubernetes cluster deployed as a set of VMs on the cluster. If we now navigate to the Cluster > Monitor > Cloud Native Storage > Container Volumes view, we can see that the container provider is Kubernetes, and that there is a list of volumes (PVs/VMDKs/FCDs) deployed on a vSAN datastore. The volume name displayed here is exactly the same name used for the volume in Kubernetes, making it easy for the vSphere admin and the Dev-Ops/Site Reliability Engineer to have a sane dialog regarding storage. We can also see that these persistent volumes are compliant with whatever policy was set in the StorageClass in Kubernetes. There are a number of additional columns that are not shown, such as the name of the storage policy, volume capacity, etc. By clicking on the icon in the second column of the volume, you will be shown a list of Kubernetes objects associated with the volume, such as cluster name, persistent volume claim name and namespace, etc. Also of interest is the fact that for volumes deployed on a vSAN datastore, clicking the volume name will take you to the layout of the object and display the various components that back it. You can also filter displayed volumes based on label identifiers if you want to see only the volumes associated with a particular container application.
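
The label filter works on the same principle as a Kubernetes equality-based label selector. The Python sketch below shows that matching logic only; the volume records, label keys and the `filter_volumes` helper are all hypothetical and say nothing about how the vSphere UI implements its filter.

```python
def matches_labels(volume_labels, selector):
    """Return True if every key/value pair in the selector is present on the volume."""
    return all(volume_labels.get(key) == value for key, value in selector.items())


def filter_volumes(volumes, selector):
    """Return the names of the volumes whose labels satisfy the selector."""
    return [
        volume["name"]
        for volume in volumes
        if matches_labels(volume.get("labels", {}), selector)
    ]


# Hypothetical volume metadata, similar in spirit to what CNS displays.
EXAMPLE_VOLUMES = [
    {"name": "pvc-1111", "labels": {"app": "mongodb", "tier": "db"}},
    {"name": "pvc-2222", "labels": {"app": "cassandra", "tier": "db"}},
    {"name": "pvc-3333", "labels": {"app": "mongodb", "tier": "backup"}},
    {"name": "pvc-4444", "labels": {}},
]

if __name__ == "__main__":
    # Show only the volumes that belong to the (made-up) mongodb application.
    print(filter_volumes(EXAMPLE_VOLUMES, {"app": "mongodb"}))
```

An empty selector matches every volume, and adding more key/value pairs narrows the list, which is the behaviour you would expect when stacking filters in a UI.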

The advantage here for vSphere administrators is that if their Kubernetes customers/end-users highlight any storage related issues, it should be extremely easy to map the objects at the Kubernetes layer to vSphere objects, and speed up any monitoring/troubleshooting needed to resolve the issue at hand.

I did want to share one additional screenshot from our technical marketing team, although it is not specifically a CNS enhancement. This is a screenshot showing vSAN Capacity Usage. In vSphere 6.7 U3 this has been improved to include the storage consumption of block container volumes which are deployed as First Class Disks on the vSAN datastore. A full list of vSAN 6.7 U3 improvements can be found here.

And that concludes my introduction to CNS. As you can see, it will offer vSphere admins really good insight into how Kubernetes applications are consuming vSphere resources from a storage perspective. Coupled with vSAN and storage policies, vSphere offers an excellent infrastructure platform for your Kubernetes deployments. For more information on CNS, check out this great Virtual Blocks blog post by my good friend Myles Gray. And for a great real-life use case, read how VMware IT used Essential PKS with CNS on vSAN for their cloud native journey.

Survey

One last item. As we continue our efforts around cloud native storage, we are conducting a small survey to further understand container storage use-cases. If you could kindly spend a few minutes completing this survey, we would be very grateful. Survey Link: https://cns-storage.surveyanalytics.com

10 Replies to “Introducing vSphere Cloud Native Storage (CNS)”

JR Irwin says:

August 16, 2019 at 11:19 pm

Hi Cormac – In one of the relatively recent interviews you did alongside Duncan Epping, I thought I heard you or Duncan say 7 out of 10 of the most popular containerized apps require and end up using stateful sets. I can’t seem to find the video now but… did I hear that correctly and if so, can you please share where that data is coming from? Thank you!!
-Jonas
1. Cormac says:
  
  August 19, 2019 at 10:11 am
  
  Yep – it came from a survey by Datadog, JR. https://www.datadoghq.com/docker-adoption/ Item #6 in their list.
daswas says:

September 6, 2019 at 7:54 am

Hi, how can I use Cloud Native Storage with Docker Swarm?
1. Cormac says:
  
  September 6, 2019 at 9:17 am
  
  Unsure. Let me try to find out.
Rolf Bartels says:

September 12, 2019 at 2:49 pm

Hi Cormac
Is there a guide available for setting this up? I have configured the cloud providers and the PVCs are being created, but the volumes are not showing up under the Container Volumes view in the vSphere UI.
1. Cormac says:
  
  September 12, 2019 at 4:08 pm
  
  Yes there is Rolf. Check out the tutorials here: https://github.com/kubernetes/cloud-provider-vsphere/blob/master/docs/book/README.md
  
  The K8s cluster must use the CSI and CPI drivers, not the earlier built-in VCP.
Doug Bernhardt says:

September 18, 2019 at 10:29 pm

How do you tie the volume name back to the files associated with it in a Datastore? I see an FCD folder in the datastore, but I can’t seem to tie those files back to the volume name.
1. Cormac says:
  
  September 19, 2019 at 12:36 pm
  
  Hi Doug – that’s a great question. Unfortunately, that is not available, and it is a feature I would like to see added.
  
  To help me persuade our product managers and engineers, can you share the use case where you need the path to the actual volume?
  
  In the meantime, I knocked out some PowerCLI + kubectl scripts to help you find this info. It’s not the most performant script, but it might address your need – https://github.com/cormachogan/vtopology
Tirelibirefe says:

November 8, 2019 at 7:54 pm

Hello Cormac;
I have been struggling with CPI for a couple of days but I hit a wall today. vSphere CPI cannot provision volumes as I use a tag based SPBM policy. It’s sad that I don’t have enough resources to have a vSAN environment.

Is there any update about the bug “vSphere Cloud Provider fails to provision volumes when using tag based SPBM policy” #75040 / https://github.com/kubernetes/kubernetes/issues/75040?

…maybe you could advise me of a workaround.

Thanks
1. Cormac says:
  
  November 11, 2019 at 8:40 am
  
  We have looked at the issue above, and this seems to be reported against the VCP driver, and not the new CSI/CPI combination. It seems there is a known issue with the VCP driver used in PKS.
  
  Could you open a new issue against the CSI/CPI and upload a new log bundle so our engineers can take a look? Thanks