A first look at SanDisk & FlashSoft
SanDisk was high on my list of storage vendors to check out at VMworld 2013 in San Francisco, as I wanted to learn more about their FlashSoft product. FlashSoft is run as the software division of SanDisk. In August 2012, they released version 3.0 of their I/O acceleration software, compatible with vSphere 5.0. In April 2013, they released version 3.1, which works with vSphere 5.0 & 5.1. I caught up with a number of folks from the FlashSoft team at VMworld 2013 to learn more about their product, and what their plans were going forward.
I am sure that most readers are well aware of the “I/O blender” issue caused by having multiple VMs all sending I/O to the same datastore. Indeed, even if these VMs were all doing sequential I/O, it would still appear somewhat random when it reaches disk. This causes performance issues, and has led to a lot of interest in technologies like flash and SSD. FlashSoft from SanDisk is an I/O acceleration product which uses SSD as a cache to accelerate I/Os. Right now it is primarily a read accelerator – what is also known as a write-thru cache. This means that every write must be committed to persistent media, but the write also goes into the cache so that future reads of that block are accelerated (retrieving the block of data from SSD is faster than retrieving it from spinning disk). Let’s look at some of the details.
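To make the write-thru idea concrete, here is a minimal Python sketch. It is not FlashSoft's implementation: the class name, the dict standing in for the datastore, and the LRU eviction policy are all my own assumptions for illustration.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Toy write-thru cache. All names here are hypothetical."""

    def __init__(self, backing, capacity):
        self.backing = backing        # dict standing in for the slow datastore
        self.capacity = capacity      # maximum number of cached blocks
        self.cache = OrderedDict()    # block id -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def write(self, block, data):
        # Write-thru: commit to persistent media first, then cache the block.
        self.backing[block] = data
        self._insert(block, data)

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)   # mark as most recently used
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]          # slow path: go to the datastore
        self._insert(block, data)
        return data

    def _insert(self, block, data):
        self.cache[block] = data
        self.cache.move_to_end(block)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used (assumed policy)
```

Because every write lands on the backing store before it is cached, losing the cache never loses data. It only costs the reads that would otherwise have been hits.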
There are two components to FlashSoft: a VMkernel component and a CIM component. There are no in-guest drivers or agents that need to be installed. They also have a vCenter plug-in that allows FlashSoft to be managed directly through the vCenter GUI.
The SSD device used for cache can be SAS, SATA or indeed PCIe. They format a raw SSD with a log-structured file system, writing data sequentially to the flash medium (which is optimal for solid-state devices) with variable block sizes. This helps improve the performance and increase the lifespan of the SSD. FlashSoft can manage a maximum cache size of 2TB per host, and this can be used to accelerate up to 255 objects (VMDKs) per host. They also have snapshot support, whereby you can select whether or not snapshots are accelerated.
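As a rough illustration of what "log structured" with variable block sizes means, here is a small Python sketch. The names are hypothetical and this says nothing about FlashSoft's actual on-disk format. New data is only ever appended, and an in-memory index remembers where the latest copy of each block lives.

```python
class AppendOnlyLog:
    """Toy log-structured store; a bytearray stands in for the raw SSD."""

    def __init__(self):
        self.log = bytearray()   # the "device": only ever appended to
        self.index = {}          # block id -> (offset, length) of latest copy

    def write(self, block, data):
        # Always write at the tail, so the device sees sequential writes
        # even when the block ids arrive in random order.
        offset = len(self.log)
        self.log += data
        self.index[block] = (offset, len(data))  # variable-length entries
        return offset

    def read(self, block):
        offset, length = self.index[block]
        return bytes(self.log[offset:offset + length])
```

Overwriting a block leaves its old copy behind as dead space in the log. A real implementation would need some form of cleaning to reclaim that space, which this sketch leaves out.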
FlashSoft does not have the concept of a flash pool. To create your 2TB of flash, you need to create a RAID configuration of SSDs via the RAID controller. A native flash pool could be a nice feature to add.
FlashSoft can be enabled on-the-fly for already running VMs. And there is full vMotion support. However, the FlashSoft engineering team made the design decision to discard the cache when a vMotion operation is initiated. The reason for this is that they did not want the migration of the cache to impact the network or the time to do the vMotion operation. I suggested that they might want to give the customer a choice here – vMotion performance versus the need for the VM to rewarm its cache at the destination. Let’s see what happens.
Finally, there is a CLI, so rather than tagging 255 different objects with cache via the UI, you can create a script to do these sorts of tasks for you.
Sample Performance
At VMworld, the guys were running a demo at the booth. They were using IOmeter, and had a virtual machine with a working set size of 1GB, 16 outstanding I/Os, 2 VMDKs, and 4 worker threads per VMDK.
The VM without FlashSoft achieved a maximum of 6,000 IOPS.
The VM with FlashSoft acceleration achieved a maximum of 100,000 IOPS.
Yes, I know IOmeter doesn’t simulate any typical workload and is a very synthetic type of performance measurement tool, but I guess it gives you some idea of what can be achieved with their product.
The SSD used was a Micron PCIe card and the IOmeter workload was 100% random, 100% read.
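For what it's worth, the arithmetic on those two numbers can be worked through in a few lines of Python. The speedup comes straight from the demo figures. The latency estimate uses Little's Law and assumes that the 16 outstanding I/Os is the total queue depth for the VM, which is my reading of the setup and was not confirmed at the booth.

```python
def speedup(baseline_iops, accelerated_iops):
    """Ratio of accelerated to baseline throughput."""
    return accelerated_iops / baseline_iops


def mean_latency_ms(outstanding_ios, iops):
    """Little's Law: concurrency = throughput * latency, solved for latency."""
    return outstanding_ios / iops * 1000.0


# Demo figures from the booth; the queue depth of 16 is an assumed total.
print(f"speedup: {speedup(6_000, 100_000):.1f}x")
print(f"latency without cache: {mean_latency_ms(16, 6_000):.2f} ms")
print(f"latency with cache:    {mean_latency_ms(16, 100_000):.2f} ms")
```

Under that assumption the demo works out to roughly a 16.7x throughput improvement, with the average I/O latency dropping from about 2.67 ms to about 0.16 ms.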
Failure Scenarios
I finally asked the team about what happens when an SSD fails – how does this impact the VMs using cache? It seems that FlashSoft will switch into pass-thru mode and stop using the SSD if an error is severe enough. While there will obviously be a performance hit with the loss of the cache, it’s great to know that your VMs will continue to run. Nice feature.
Virtual machines that have FlashSoft configured can also be restarted by vSphere HA. However, there doesn’t seem to be any way to guarantee that a VM restarts on a host which has FlashSoft installed. There could therefore be additional management overhead here: once the VM is restarted, you may have to migrate it to a host with FlashSoft to get the caching feature back on the VM. Again, this might be a nice issue to address in a future release.
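The pass-thru behaviour on SSD failure can be sketched as follows. This is only my illustration of the idea in Python, with hypothetical names throughout (`SsdError`, `FlakySsd`, `FailoverCache`), and it treats any SSD error as severe enough to disable the cache, which is a simplification.

```python
class SsdError(Exception):
    """Hypothetical error raised when the cache device fails."""


class FlakySsd(dict):
    """Dict standing in for the SSD; set .failed = True to simulate a fault."""
    failed = False

    def __contains__(self, key):
        if self.failed:
            raise SsdError("device error")
        return super().__contains__(key)

    def __setitem__(self, key, value):
        if self.failed:
            raise SsdError("device error")
        super().__setitem__(key, value)


class FailoverCache:
    def __init__(self, backing, ssd):
        self.backing = backing       # dict standing in for the datastore
        self.ssd = ssd
        self.pass_through = False    # True once the SSD has been given up on

    def read(self, block):
        if not self.pass_through:
            try:
                if block in self.ssd:
                    return self.ssd[block]
            except SsdError:
                self.pass_through = True   # stop using the SSD from now on
        data = self.backing[block]         # always safe: data lives here too
        self._populate(block, data)
        return data

    def write(self, block, data):
        self.backing[block] = data         # write-thru: datastore first
        self._populate(block, data)

    def _populate(self, block, data):
        if self.pass_through:
            return
        try:
            self.ssd[block] = data
        except SsdError:
            self.pass_through = True
```

Since the cache is write-thru, the datastore always holds a complete copy of the data, so dropping to pass-thru mode costs performance but not correctness.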
Future Plans
FlashSoft were kind enough to then discuss some road-map plans with me. In no particular order, this is what they want to implement in some future releases:
- Support for write-back, not just write-thru. This is obviously a considerable undertaking, as FlashSoft will need to ensure that a full copy of the data is still available, even when a host fails. They are looking at using pairs of hosts, logically grouped, so that each pair of hosts will mirror the cache. They called these HA pairs, but these are FlashSoft pairs, not to be confused with vSphere HA. FlashSoft already has write-back for Windows and Linux, so one assumes it won’t be too long before this makes it into vSphere.
- They also want to introduce some quality of service (QoS) mechanisms. Since SSDs have varying levels of performance, FlashSoft plan to implement a mechanism which will allow VMs to be placed on SSDs with different specifications. This could be quite a nice feature.
- Another QoS feature that I know they are looking at is how to prioritize certain VMs when there is contention for the cache resource. Right now all VMs share the same SSD equally, and there is no way to prioritize some VMs over others.
- I mentioned that there is currently no ‘Flash pool’ mechanism and that FlashSoft relies on volumes built via the RAID controller. SanDisk has already built a ‘Logical Volume Manager’ (LVM) that allows multiple SSDs to be pooled together as a single logical volume in their Windows/Linux product. This LVM will be in the future vSphere product too, and is the basis for the new QoS features.
- Finally, FlashSoft mentioned that they would also like to examine the virtual machine I/O, identify I/O patterns which would benefit from caching and enable caching only for those VMs which need it.
I thoroughly enjoyed my conversation with Serge, Manish and the folks at SanDisk. They took a lot of time to talk about what their product can do now, and what features and functionality they are adding going forward. Hopefully this post has given you an idea about some of the questions to ask SanDisk if you are speaking to them. They are well worth checking out if you are in the market for an I/O acceleration product.
SanDisk and the FlashSoft team will be at VMworld EMEA. If you are attending VMworld 2013 in Barcelona, drop by their booth to learn more.
Nice write-up – your readers might also like the video of Brian Cox from SanDisk on theCUBE at Oracle Open World http://youtu.be/50Uch6oBUo0
Best Regards,
Stu Miniman (vExpert)
Wikibon.org
Twitter: @stu
Thanks Stu – I’m sure they would.