Kubernetes, Hadoop, Persistent Volumes and vSAN

At VMworld 2018, one of the sessions I presented was on running Kubernetes on vSphere, and specifically on using vSAN for persistent storage. In that presentation (which you can find here), I used Hadoop as a specific example, primarily because there are a number of moving parts to Hadoop. For example, there are the concepts of a Namenode and a Datanode. Put simply, a Namenode provides the lookup for blocks, whereas Datanodes store the actual blocks of data. Namenodes can be configured in an HA pair with a standby Namenode, but this requires a lot more configuration and resources, and introduces additional…

VMworld 2018 vSAN Roundup – Monday, Aug 27th

VMworld is now officially underway, and as usual, day 1 is full of new announcements. vSAN is no exception. There have been announcements around the next release of vSAN (6.7U1), specific vSAN improvements to VMware Cloud on AWS, a Cloudera Hadoop validation on vSAN and a beta announcement. Since we’ve had quite a number of announcements, I thought I would try to capture them all in one place.

Getting started with Cloudera Hadoop on vSphere

This past week, my buddy Paudie and I have been neck-deep in Cloudera/Hadoop, with a view to getting it successfully deployed on vSphere. The purpose of this was solely a learning exercise, to try to understand what operational considerations need to be taken into account when running Hadoop on top of vSphere. These operational considerations range from items such as maintenance mode, rack awareness, high availability, replication and protection of the data. Both Cloudera/Hadoop and vSphere offer ways to do all of this, so the longer-term objective is to figure out whether or not these features are compatible, and whether…