With the announcements just made at VMworld 2015, the embargo on Virtual SAN 6.1 has been lifted, so we can now publicly discuss some of the new features and functionality. Virtual SAN is VMware’s software-defined solution for Hyper-Converged Infrastructure (HCI). For the last number of months, I’ve been heavily involved in preparing for the Virtual SAN 6.1 launch. What follows is a brief description of what I find to be the most interesting and exciting of the upcoming features in Virtual SAN 6.1. Later on, I will follow up with more in-depth blog posts on the new features and functionality.
I’ve been involved in a few conversations recently regarding how VSAN trace files are handled when an ESXi host that is participating in a VSAN cluster boots from a flash device. I already did a post about some of these considerations in the past, but it focused mostly on USB/SD. SATADOM was not included in that discussion, as we did not initially support SATADOM in VSAN 5.5, and only announced SATADOM support for VSAN 6.0. It seems that there are some different behaviors that need to be taken into account between the various flash boot devices, which is why I decided to write this post.
As usual, there have been loads of things happening over the last 12 months in the storage space. The Solutions Exchange at VMworld is always a great place to meet new storage startups, and get some further information on their respective products and innovations. This year, I’ve made a note of a few startups that I wish to catch up with at VMworld 2015, to find out what issues they are trying to address with their technology, and why a customer should choose their solution over some of the others in the storage space.
Disclaimer: Please note that I am not endorsing any of these vendors. This is simply technology that I am interested in, and something I want to learn more about at VMworld. I’d urge any readers attending VMworld to do the same. For those not attending, my goal is to learn enough about these new startups so that I can write an article about them at some point (if I haven’t already done so).
I had a query recently from a partner who was deploying VMware Horizon View 6.1 on top of an all-flash VSAN 6.0. They had done all the due diligence with configuring the AF-VSAN appropriately, marking certain flash devices as capacity devices, and so on. The configuration looked something like this:
They then went ahead and deployed Horizon View 6.1, which they had done many times before on hybrid configurations. They were able to successfully deploy full clone pools on the AF-VSAN, but hit a strange issue when deploying linked clone pools (floating/dedicated). The clone virtual machine operation would fail with an “Insufficient disk space on datastore” error, similar to the following:
A couple of months back, I wrote a short article on Rubrik. They were just coming out of stealth mode and had started an early access program. Since they had not officially launched, there wasn’t a lot that I was allowed to say about the company, other than to give a high-level overview. As they have now officially launched their r300 series of products, along with news of a massive $41 million Series B funding round, I can now share some additional details about their products and technology. Just to recap on what Rubrik do, they are offering a converged and scale-out backup software and backup storage appliance. The Rubrik appliance (Brik) is a “rack and go” architecture, with the ability to scale from three to thousands of nodes (unlimited) using industry standard 2U commodity appliance hardware.
The whole pitch is the idea that “backups suck”, and they want to give administrators a much better backup and restore experience, similar to Apple’s ‘Time Machine’ feature.
With the release of VSAN 6.0, and the new all-flash configuration (AF-VSAN), I have received a number of queries around our 10% cache recommendation. The main query is, since AF-VSAN no longer requires a read cache, can we get away with a smaller write cache/buffer size?
Before getting into the cache sizing, it is probably worth beginning this post with an explanation about the caching algorithm changes between version 5.5 and 6.0. In VSAN 5.5, which came as a hybrid configuration only with a mixture of flash and spinning disk, cache behaved as both a write buffer (30%) and read cache (70%). If a read request was not satisfied by the cache, in other words there was a read cache miss, then the data block was retrieved from the capacity layer. This was an expensive operation, especially in terms of latency, so the guideline was to keep your working set in cache as much as possible. Since the majority of virtualized applications have a working set somewhere in the region of 10%, this was where the cache size recommendation of 10% came from. With hybrid, there is regular destaging of data blocks from write cache to spinning disk. This is a proximal algorithm, which looks to destage data blocks that are contiguous (adjacent to one another). This speeds up the destaging operations.
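To make the hybrid arithmetic above a little more concrete, here is a minimal Python sketch. It is only an illustration, not a VMware sizing tool. The function names are hypothetical, and I am assuming that the 10% guideline is applied to the consumed capacity of the datastore, which is not spelled out in the paragraph above. The 70/30 split between read cache and write buffer is the VSAN 5.5 hybrid behavior described above.

```python
# Illustrative sketch only: function names are hypothetical, and applying the
# 10% guideline to *consumed* capacity is an assumption made for this example.

def recommended_cache_gb(consumed_capacity_gb, cache_percent=10):
    """Size the cache tier as a percentage of consumed capacity (assumed basis)."""
    return consumed_capacity_gb * cache_percent / 100


def hybrid_cache_split(cache_gb, read_percent=70, write_percent=30):
    """Split a hybrid cache into read cache and write buffer (70/30 in VSAN 5.5)."""
    if read_percent + write_percent != 100:
        raise ValueError("read and write percentages must add up to 100")
    return {
        "read_cache_gb": cache_gb * read_percent / 100,
        "write_buffer_gb": cache_gb * write_percent / 100,
    }


if __name__ == "__main__":
    cache = recommended_cache_gb(20000)  # 20 TB consumed, expressed in GB
    print(cache, hybrid_cache_split(cache))
```

With 20 TB consumed, this works out to a 2,000 GB cache tier, of which 1,400 GB acts as read cache and 600 GB as write buffer in a hybrid configuration.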
In Virtual SAN version 6.0, VMware introduced support for an all-flash VSAN. In other words, both the caching layer and the capacity layer could be made up of flash-based devices such as SSDs. However, the mechanism for marking some flash devices as being designated for the capacity layer, while leaving other flash devices as designated for the caching layer, is not at all intuitive at first glance. For that reason, I’ve included some steps here on how to do it.