If you were wondering why my blogging has dropped off in recent months, wonder no more. I’ve been fully immersed in the next release of VSAN. Today VMware announced the launch of VSAN 6.2, the next version of VMware’s Virtual SAN product. It is almost 2.5 years since we launched the VSAN beta at VMworld 2013, and almost 2 years to the day since we officially GA’ed our first release of VSAN way back in March 2014. A lot has happened since then, with 3 distinct releases in that 2-year period (6.0, 6.1 and now 6.2). For me, the product has matured significantly in that time, with 3,000 customers and lots of added features. VSAN 6.2 is the most significant release we have had since the initial launch.
The following is by no means a comprehensive list of all of the new VSAN 6.2 features, but it covers the major ones, along with a few others that I feel might be of interest to readers. In my opinion, we now have a feature-complete product, and a world-class hyper-converged solution for any application. Read on to learn about the new features that we have added to this latest and greatest version of Virtual SAN.
This week Datrium announced that their DVX system is now generally available. I met these guys at VMworld 2015, and wrote a closer look at Datrium here. If you want a deeper dive into their solution, please read that post. But in a nutshell, their solution uses a combination of host side flash devices to accelerate read I/O, while at the same time writing to the Datrium hardware storage appliance (called a NetShelf). The NetShelf provides “cheap, durable storage that is easy to manage”. The DVX architecture presents the combined local cache/flash devices and NetShelf as a single shared NFS v3 datastore to your ESXi hosts.
The storage space has been a very exciting one over recent years. There have been so many new start-ups and innovations that it sometimes becomes difficult to keep track. More recently, there has been a lot of news around mergers, acquisitions and IPOs in the storage industry. It got me thinking about a lot of the changes we have seen over the past 3-4 years in the storage market. Just for my own interest, I went back over many of my blogs, and the various conversations I had with people at various VMworld events and VMUG meetings, and tried to see where a lot of these companies/products are now, and what they are currently doing. Now, I am not going to mention every single vendor here. I’m simply trying to highlight the ones that were acquired or merged or indeed IPO’ed (and in some cases are no longer with us) during this period.
I have had this question a number of times now. Those of you familiar with VSAN will know that if a component goes absent for a period of 60 minutes (the default), then VSAN will begin rebuilding a new copy of the component elsewhere in the cluster (if resources allow it). The question then is, if the missing/absent/failed component recovers and becomes visible to VSAN once again, what happens? Will we throw away the component that was just created, or will we throw away the original component that recovered?
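To make the timer behaviour concrete, here is a minimal sketch of the rebuild decision described above. It is only an illustration: the function name, its parameters and the constant are hypothetical and are not VSAN APIs; the only details taken from the post are the 60 minute default and the "if resources allow it" condition. It deliberately does not model what happens when the absent component comes back.

```python
from datetime import datetime, timedelta

# Default delay before a rebuild starts, as described in the post (60 minutes).
DEFAULT_REPAIR_DELAY = timedelta(minutes=60)


def should_start_rebuild(absent_since, now, resources_available,
                         repair_delay=DEFAULT_REPAIR_DELAY):
    """Return True once a component has been absent for the full delay.

    Hypothetical helper for illustration only. A rebuild is started only
    when the delay has expired AND the cluster has resources to place a
    new copy of the component.
    """
    absent_for = now - absent_since
    return resources_available and absent_for >= repair_delay


if __name__ == "__main__":
    went_absent = datetime(2016, 1, 1, 12, 0)
    for minutes in (30, 59, 60, 90):
        check_time = went_absent + timedelta(minutes=minutes)
        print(minutes, should_start_rebuild(went_absent, check_time, True))
```

Running the script prints the decision at a few points in time after the component goes absent, which shows that nothing is rebuilt until the delay has fully elapsed.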
Primary Data were one of the storage vendors that I wanted to catch up with at VMworld 2015. I was fortunate enough to meet with Graham Smith, who is their Director of Virtualization Product Management. Graham gave me a demonstration of the Primary Data product in the Solutions Exchange at VMworld, and I also had an opportunity to visit their offices in Los Altos during a recent trip to the Bay Area and catch up once again with Graham and Kaycee Lai, SVP of Product Management & Sales at Primary Data. Before we get into the product and solution details, I wanted to go over a brief history of the company and the problem that they are trying to solve with their DataSphere Platform.
Many regular readers will know that we do not do read locality in Virtual SAN. For VSAN, it has always been a trade-off of networking vs. storage latency. Let me give you an example. When we deploy a virtual machine with multiple objects (e.g. a VMDK), and this VMDK is mirrored across two disks on two different hosts, we read in a round-robin fashion from both copies based on the block offset. Similarly, as the number of failures to tolerate is increased, resulting in additional mirror copies, we continue to read in a round-robin fashion from each copy, again based on block offset. In fact, we don’t even need to have the VM’s compute reside on the same host as a copy of the data. In other words, the compute could be on host 1, the first copy of the data could be on host 2 and the second copy of the data could be on host 3. Yes, I/O will have to do a single network hop, but when compared to latency in the I/O stack itself, this is negligible. The cache associated with each copy of the data is also warmed as reads are requested. The added benefit of this approach is that vMotion operations between any of the hosts in the VSAN cluster do not impact the performance of the VM; we can migrate the VM to our heart’s content and still get the same performance.
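The offset-based round-robin read described above can be sketched in a few lines. This is an illustration under stated assumptions, not VSAN's actual implementation: the chunk size, the function names and the simple modulo mapping are all hypothetical. The point it shows is that the copy serving a read depends only on the block offset and the number of mirror copies, never on which host the VM is running on.

```python
from collections import Counter
from typing import Iterable, Sequence

# Hypothetical granularity at which reads alternate between copies.
# The real value is not given in the post; 1 MiB is an assumption.
CHUNK_SIZE = 1024 * 1024


def pick_replica(block_offset: int, replica_hosts: Sequence[str],
                 chunk_size: int = CHUNK_SIZE) -> str:
    """Choose which mirror copy serves a read, using only the block offset."""
    if not replica_hosts:
        raise ValueError("at least one replica is required")
    chunk_index = block_offset // chunk_size
    # Consecutive chunks rotate through the copies in round-robin order.
    return replica_hosts[chunk_index % len(replica_hosts)]


def read_distribution(offsets: Iterable[int], replica_hosts: Sequence[str],
                      chunk_size: int = CHUNK_SIZE) -> Counter:
    """Count how many reads each copy would serve for the given offsets."""
    return Counter(pick_replica(o, replica_hosts, chunk_size) for o in offsets)


if __name__ == "__main__":
    # Compute runs on host 1; the two mirror copies live on hosts 2 and 3.
    replicas = ["host2", "host3"]
    offsets = range(0, 8 * CHUNK_SIZE, CHUNK_SIZE)
    print(read_distribution(offsets, replicas))
```

Because a given offset always maps to the same copy in this sketch, each copy's cache only needs to hold its own share of the blocks, and the mapping is unchanged when the VM moves to another host, which is consistent with the vMotion behaviour described above.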
So that’s how things were up until the VSAN 6.1 release. There is now a new network latency element which changes the equation when we talk about VSAN stretched clusters. The reasons for this change will become obvious shortly.
The more observant of you may have noticed the following entry in the VSAN 6.1 Release Notes: “Virtual SAN monitors solid state drive and magnetic disk drive health and proactively isolates unhealthy devices by unmounting them. It detects gradual failure of a Virtual SAN disk and isolates the device before congestion builds up within the affected host and the entire Virtual SAN cluster. An alarm is generated from each host whenever an unhealthy device is detected and an event is generated if an unhealthy device is automatically unmounted.” The purpose of this post is to provide you with a little bit more information around this cool new feature.