Kubernetes, Hadoop, Persistent Volumes and vSAN

At VMworld 2018, one of the sessions I presented was on running Kubernetes on vSphere, and specifically on using vSAN for persistent storage. In that presentation (which you can find here), I used Hadoop as a specific example, primarily because there are a number of moving parts to Hadoop. For example, there are the concepts of a Namenode and a Datanode. Put simply, a Namenode provides the lookup for blocks, whereas Datanodes store the actual blocks of data. Namenodes can be configured in an HA pair with a standby Namenode, but this requires a lot more configuration and resources, and introduces additional…

What’s in the vSphere and vSAN 6.7 release?

Today VMware unveils vSphere version 6.7, which also includes a new version of vSAN. In this post, I am going to highlight some of the big-ticket items that are in vSphere 6.7 from a core storage perspective, and also some of the new features that you will find in vSAN 6.7. I’ll also cover some of the enhancements coming in Virtual Volumes (VVols).

Getting started with Cloudera Hadoop on vSphere

This past week, my buddy Paudie and I have been neck-deep in Cloudera/Hadoop, with a view to getting it successfully deployed on vSphere. The purpose of this was solely a learning exercise, to try to understand what operational considerations need to be taken into account when running Hadoop on top of vSphere. These operational considerations range from items such as maintenance mode, rack awareness, high availability, replication and protection of the data. Both Cloudera/Hadoop and vSphere offer ways to do all of this, so the longer-term objective is to figure out whether or not these features are compatible, and whether…