This week I attended KubeCon and CloudNativeCon 2018 in Copenhagen. I had two primary goals during this visit: (a) find out what was happening with storage in the world of Kubernetes (K8s), and (b) look at how people were doing day 2 operations (monitoring, logging, etc.), as well as the challenges one might encounter running K8s in production.
Let’s start with what is happening in storage. The first storage-related session I went to was on Rook, presented by Jared Watts. According to Jared, Rook is trying to avoid vendor lock-in with storage and to address the issue of portability. If I understood correctly, Rook is about deploying and then provisioning distributed network storage such as Ceph, and making it available to applications running on your K8s cluster. However, Rook only does the provisioning and management of storage for K8s – it is not in the data path itself.
It seems that one of the key features of Rook is the fact that it is implemented with K8s operators. Operators were also covered in the keynote on day #2, entitled Stateful Application Operators. What this basically means is that the K8s API, and therefore kubectl (the command line interface for running commands against K8s clusters), can be extended through bespoke CRDs – Custom Resource Definitions – to do specific application-related stuff. So, through Rook, when kubectl is asked to create a cluster/pool/storage object, the Rook operator is watching for these sorts of events. On receipt of an event like this, Rook communicates with the storage layer to instantiate the necessary storage components, as well as talking to kubelet (the agent that runs on each K8s node) to make sure the necessary persistent volume is created/accessible/mounted via the Rook Volume Plugin. If the underlying storage is Ceph, then the idea is to use kubectl to create any file stores or block stores and then be able to consume them within K8s. If I caught the drift of Jared’s session, once the operator is up and running in the K8s cluster, we can even get it to create the Ceph cluster in the first place.
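To make the operator idea a little more concrete, here is a minimal sketch of the watch-and-react loop in Python. It is not Rook's code and does not talk to a real K8s API; the `Event`, `FakeStorageBackend` and `run_operator` names are all hypothetical, and the storage layer is just an in-memory set standing in for something like Ceph.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """A hypothetical watch event for a custom resource."""
    kind: str      # "ADDED" or "DELETED"
    resource: str  # custom resource type, e.g. "Pool"
    name: str      # name of the object the user created or deleted


@dataclass
class FakeStorageBackend:
    """Stands in for the storage layer (assumption: a real operator would call Ceph)."""
    pools: set = field(default_factory=set)

    def create_pool(self, name):
        self.pools.add(name)

    def delete_pool(self, name):
        self.pools.discard(name)


def run_operator(events, backend):
    """React to custom resource events by driving the storage backend."""
    for event in events:
        if event.resource != "Pool":
            continue  # this sketch only handles one resource type
        if event.kind == "ADDED":
            backend.create_pool(event.name)
        elif event.kind == "DELETED":
            backend.delete_pool(event.name)
    return backend


if __name__ == "__main__":
    stream = [
        Event("ADDED", "Pool", "fast"),
        Event("ADDED", "Pool", "slow"),
        Event("DELETED", "Pool", "slow"),
    ]
    print(run_operator(stream, FakeStorageBackend()).pools)
```

The point is simply that the user declares an object, and a long-running controller notices the event and does the storage-specific work on their behalf.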
Currently Rook supports Ceph, but Jared also mentioned that integration with CockroachDB, Minio and Nexenta is in the works. There is a framework for other storage providers who want to integrate into K8s. Rook is currently in alpha state and is an inception-level project at the Cloud Native Computing Foundation (CNCF). Find out more about Rook here – https://rook.io/
My second storage-related session was the Storage SIG, or Special Interest Group, to give it its full title. This was presented by Saad Ali @ Google. This session primarily focused on the Container Storage Interface (CSI) effort. Part of this project is focused on taking the 3rd party volume plugins out of the Kubernetes tree and creating a separate volume plugin system instead. There are a number of reasons for this. Many 3rd parties in the storage space do not want to release their code as open source, nor do they want to be tied to K8s release cycles. Saad said that they will not deprecate the current volume plugins, but I guess this is something you will need to keep in mind as you move towards later versions of Kubernetes this year, especially if you are already using one of the volume plugins. A number of other storage projects were discussed, such as the ability to migrate and share data between K8s clusters, data gravity (don’t move the data to the pods but place pods on the same node/host as the data), as well as how to do volume snapshots and the ability to convert these snapshots to stand-alone volumes later on. Some of these projects are planned for later this year. A question was asked about the DELL-EMC initiative called REX-Ray, and how this compares to the CSI initiative. REX-Ray has now pivoted, according to Saad, to being a framework where storage vendors can develop their own CSI plugins with minimal code. If you’d like to be involved in the K8s Storage SIG, you can find the details here.
To finish on the storage aspect of the conference, we had a walk around the solutions exchange to see which storage vendors had a presence. We met the guys from Portworx. I also met them at DockerCon ’17 in Austin and wrote about them here. When I asked what was new, they told me they now have the ability to snapshot volumes belonging to an application that spans multiple containers. It seems that they can also encrypt and replicate at a container volume level. So, some nice enhancements since we last spoke. What I omitted to ask is whether they will need to change anything to align with the new CSI approach.
We also caught up with the StorageOS guys. They have both CSI and in-tree drivers for storage, and they are following along with the CSI designs. One thing they are waiting on is a decision on how CSI will handle snapshots; once that is understood, they plan to implement it. Good conversations all round.
Now it was the turn of monitoring. Basically, Prometheus is king of all things metric-related in the world of Kubernetes. There were a bunch of different sessions dedicated to it. It seems that all applications (at least that is how it appeared to me) export their metrics in a format that Prometheus can understand. There was even a session by Matt Layher @ Digital Ocean, who explained how to export metrics from your app in a way that Prometheus could consume them. More on Prometheus here: https://prometheus.io/.
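For a feel of what "a format that Prometheus can understand" looks like, here is a small sketch that renders a metric in the Prometheus text exposition format using only the Python standard library. This is not what Matt showed in his session, and a real application would normally use an official client library; the `render_metric` helper and the metric name are made up for illustration.

```python
def escape_label_value(value):
    """Escape backslashes, double quotes and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family as Prometheus text exposition lines.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
    ]
    for labels, value in samples:
        if labels:
            label_str = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric: a request counter split by HTTP method.
    print(render_metric(
        "app_requests_total",
        "Total requests.",
        "counter",
        [({"method": "GET"}, 3), ({"method": "POST"}, 1)],
    ), end="")
```

An application serves text like this over HTTP (conventionally on a `/metrics` path), and Prometheus scrapes it on a schedule.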
We met a number of companies in the solutions exchange who were focused on monitoring K8s. We had some good conversations with both LightStep and DataDog, and Google themselves had their own session talking about OpenCensus, which, if I understood correctly, is a single set of libraries that allows metrics and traces to be captured from any application. When you are trying to track a request across multiple systems and/or across multiple micro-services, this becomes quite important. Morgan McLean of Google stated that they are working on exporters to send these metrics and traces to different backends, such as Zipkin, Jaeger, SignalFX and, of course, Prometheus.
One interesting session that I attended was by Eduardo Silva @ Treasure Data. He talked us through how Docker containers and Kubernetes both generate separate log streams, which you really need to unify to get the full picture of what is happening in your cluster. Eduardo introduced us to the fluentd data collector, which is run as a daemon set on the cluster (a daemon set ensures that a copy of a pod runs on every node in the cluster). It pulls in the container logs (available from the file system / journald) and the K8s logs from the master node. Although we were caught for time, we were also introduced to fluentbit, a less memory-intensive version of fluentd which also does log processing and forwarding. It has various application parsers, can exclude certain pods from logging, has enterprise connectors to the likes of Splunk and Kafka, and can redirect its output to, you guessed it, Prometheus. More on fluentd here: https://www.fluentd.org/. More on fluentbit here: https://fluentbit.io/.
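As a toy illustration of why unifying the two streams matters, the sketch below merges container log records and K8s log records into one time-ordered view. This is not how fluentd works internally; the `unify_logs` helper and the `(timestamp, source, message)` record layout are assumptions made purely for the example, and each input is assumed to be already sorted by timestamp.

```python
import heapq


def unify_logs(container_logs, cluster_logs):
    """Merge two timestamp-sorted log streams into one ordered list.

    Each record is a (timestamp, source, message) tuple.
    """
    return list(heapq.merge(container_logs, cluster_logs, key=lambda rec: rec[0]))


if __name__ == "__main__":
    # Hypothetical records; timestamps are seconds since some start point.
    container = [
        (1.0, "container", "app started"),
        (4.0, "container", "request served"),
    ]
    cluster = [
        (2.0, "kubelet", "pod scheduled"),
        (3.0, "apiserver", "endpoint updated"),
    ]
    for timestamp, source, message in unify_logs(container, cluster):
        print(f"{timestamp:>5} {source:<10} {message}")
```

Seen side by side in a single stream, an application error can be lined up with whatever the cluster was doing at the same moment.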
Having seen what people were doing in the metrics, tracing and monitoring space, it was also good to see some real-life examples highlighting why this is so important. There were a number of sessions describing what could happen when things went wrong with K8s. During the closing keynote on day #1, Oliver Beattie of Monzo Bank in the UK described how a single API change between K8s 1.6 and 1.7 to handle a null reference for replicas led to an outage of over an hour at the bank. It was interesting to hear about the domino effect one minor change could have. On day #2, we heard from the guys at Oath, the digital content division of Verizon, which includes Yahoo and AOL. They discussed various issues they have had with K8s in production. I guess you could summarize this session along the lines of: K8s has a lot of moving parts, and having components at slightly mismatched versions can lead to some serious problems. Bugs are also an issue, as is human error. And of course, they shared how they were preventing these issues from happening again through the various guard-rails they were putting in place.
Among the other notable announcements, gVisor was one that caught my attention. This was announced by Aparna Sinha of Google. Aparna mentioned that one of the problems with containers is that they do not contain very well. To address this, they have developed gVisor. This is a very lightweight kernel that runs in the user space of an OS. This will allow you to have sandboxed containers isolated by gVisor and still get the benefits of containers (resource sharing, quick start). The idea is that this will provide a level of isolation that prevents a container from impacting the underlying node/host. More details can be found here: https://github.com/google/gvisor
Something else that caught my eye was Kata containers. It was tagged as the “speed of containers with the security of a VM”. This is essentially containers running as lightweight virtual machines on KVM. Although the project is managed by the OpenStack Foundation, they made it clear that it was just managed there and that, other than that, there was no connection to OpenStack. To me, Kata containers did appear to have some similarities to vSphere Integrated Containers from VMware. You can learn more about Kata here – https://katacontainers.io/.
Both of these projects (gVisor and Kata containers) would suggest that there are still many benefits to be gained from running containers in VMs, which can provide advantages such as security and sandboxing over a bare-metal approach.
Lastly, it would be remiss of me if I didn’t mention the VMware/Kubernetes SIG (Special Interest Group). This is led by Steve Wong and Fabio Rapposelli. This is our forum for discussing items like best practices for running K8s on VMware. It is also where we outline our feature roadmap on where we plan to integrate. It is also where we look for input and feedback. It was emphasized very strongly that this was not just for vSphere. If you are running K8s on Fusion or Workstation, you are also most welcome to join. Check it out here.
Lots happening in this space, and lots of things to think about for sure as Kubernetes gains popularity. Thanks for reading.