There are two components to FlashSoft: a VMkernel component and a CIM component. There are no in-guest drivers or agents that need to be installed. There is also a vCenter plug-in that allows FlashSoft to be managed directly through the vCenter GUI.
FlashSoft does not have the concept of a flash pool. To create your 2TB of flash, you need to create a RAID configuration of SSDs via the RAID controller. A pooling mechanism could be a nice feature to add.
FlashSoft can be enabled on-the-fly for already running VMs. And there is full vMotion support. However, the FlashSoft engineering team made the design decision to discard the cache when a vMotion operation is initiated. The reason for this is that they did not want the migration of the cache to impact the network or the time to do the vMotion operation. I suggested that they might want to give the customer a choice here – vMotion performance versus the need for the VM to rewarm its cache at the destination. Let’s see what happens.
Finally, there is a CLI interface, so rather than tagging 255 different objects with cache via the UI, you can actually create a script to do these sorts of tasks for you.
Sample Performance
At VMworld, the guys were running a demo at the booth. They were using IOmeter, and had a virtual machine which contained a working set size of 1GB, had 16 outstanding I/Os, used 2 VMDKs, and had 4 worker threads per VMDK.
The VM without FlashSoft achieved a maximum of 6,000 IOPS.
The VM with FlashSoft achieved a maximum of 100,000 IOPS.
Yes, I know IOmeter doesn’t simulate any typical workload and is a very synthetic type of performance measurement tool, but I guess it gives you some idea of what can be achieved with their product.
The SSD used was a Micron PCIe card and the IOmeter workload was 100% random, 100% read.
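To put those two numbers side by side, here is a quick back-of-the-envelope calculation. The speedup comes straight from the IOPS figures above. The latency estimate is only a rough sketch: it uses Little's Law and assumes the 16 outstanding I/Os are the total in flight and that the queue stays full, neither of which was confirmed at the booth.

```python
# IOPS figures quoted from the booth demo.
baseline_iops = 6_000
accelerated_iops = 100_000

# Assumption (not confirmed by the demo): 16 I/Os in flight in total.
outstanding_ios = 16


def avg_latency_ms(outstanding, iops):
    """Little's Law: average latency = I/Os in flight / throughput."""
    return outstanding / iops * 1000.0


speedup = accelerated_iops / baseline_iops
print(f"Speedup: {speedup:.1f}x")  # Speedup: 16.7x
print(f"Estimated latency without cache: {avg_latency_ms(outstanding_ios, baseline_iops):.2f} ms")  # 2.67 ms
print(f"Estimated latency with cache: {avg_latency_ms(outstanding_ios, accelerated_iops):.2f} ms")  # 0.16 ms
```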
Failure Scenarios
I finally asked the team about what happens when an SSD fails, and how this impacts the VMs using the cache. It seems that FlashSoft will switch into pass-thru mode and stop using the SSD if an error is severe enough. While there will obviously be a performance hit with the loss of the cache, it's great to know that your VMs will continue to run. Nice feature.
Virtual machines that have FlashSoft configured can also be restarted by vSphere HA. However, there doesn't seem to be any way to ensure that a VM restarts on a host which has FlashSoft installed. There could therefore be additional management overhead here: once the VM is restarted, you may have to migrate it to a host with FlashSoft to get the caching feature back on the VM. Again, this might be a nice issue to address in a future release.
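To illustrate the idea (and only the idea), here is a toy write-thru cache that drops into pass-thru mode when the flash device throws an error. This is not FlashSoft's implementation; `WriteThroughCache`, `FakeFlash` and `FlashError` are hypothetical names made up for this sketch, and plain dictionaries stand in for the SSD and the backing datastore.

```python
class FlashError(Exception):
    """Hypothetical stand-in for a severe SSD error."""


class FakeFlash(dict):
    """Dictionary posing as an SSD; set failed=True to simulate a fault."""
    failed = False

    def _check(self):
        if self.failed:
            raise FlashError("device error")

    def __contains__(self, key):
        self._check()
        return super().__contains__(key)

    def __getitem__(self, key):
        self._check()
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self._check()
        super().__setitem__(key, value)


class WriteThroughCache:
    def __init__(self, backing, flash):
        self.backing = backing      # authoritative copy of every block
        self.flash = flash          # cache device, may fail
        self.pass_through = False   # True once the flash is given up on

    def read(self, block):
        if not self.pass_through:
            try:
                if block in self.flash:
                    return self.flash[block]   # cache hit
            except FlashError:
                self.pass_through = True       # stop using the SSD
        data = self.backing[block]             # miss or pass-thru
        self._populate(block, data)
        return data

    def write(self, block, data):
        # Write-thru: the backing store always gets the write, so losing
        # the cache never loses data.
        self.backing[block] = data
        self._populate(block, data)

    def _populate(self, block, data):
        if self.pass_through:
            return
        try:
            self.flash[block] = data
        except FlashError:
            self.pass_through = True
```

Because every write lands on the backing store before the cache is touched, a failed SSD costs performance but not data, which matches the behaviour the team described.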
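Until something like that exists, the post-restart check could be scripted. The sketch below is purely illustrative: it takes an inventory you have already gathered by whatever means you prefer (the VM names, host names and the `vms_needing_migration` helper are all hypothetical) and reports which cache-enabled VMs have landed on a host without FlashSoft.

```python
def vms_needing_migration(vm_to_host, flashsoft_hosts, cached_vms):
    """Return cache-enabled VMs currently on a host without FlashSoft."""
    return sorted(
        vm for vm in cached_vms
        if vm_to_host.get(vm) not in flashsoft_hosts
    )


# Hypothetical inventory after an HA restart.
vm_to_host = {"sql01": "esx-a", "web01": "esx-c", "exch01": "esx-c"}
flashsoft_hosts = {"esx-a", "esx-b"}
cached_vms = {"sql01", "exch01"}

print(vms_needing_migration(vm_to_host, flashsoft_hosts, cached_vms))  # ['exch01']
```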
Future Plans
FlashSoft were kind enough to then discuss some road-map plans with me. In no particular order, this is what they want to implement in some future releases:
- Support for write-back, not just write-thru. This is obviously a considerable undertaking, as FlashSoft will need to ensure that a full copy of the data is still available, even when a host fails. They are looking at using pairs of hosts, logically grouped, so that each pair of hosts will mirror the cache. They called these HA pairs, but these are FlashSoft pairs, not to be confused with vSphere HA. FlashSoft already has write-back for Windows and Linux, so one assumes it won’t be too long before this makes it into vSphere.
- They also want to introduce some quality of service (QoS) mechanisms. Since SSDs have varying levels of performance, FlashSoft plan to implement a mechanism which will allow VMs to be placed on SSDs with different specifications. This could be quite a nice feature.
- Another QoS feature that I know they are looking at is how to prioritize certain VMs when there is contention for the cache resource. Right now all VMs share the same SSD equally, and there is no way to prioritize some VMs over others.
- I mentioned that there is currently no ‘Flash pool’ mechanism, and that FlashSoft relies on volumes built via the RAID controller. SanDisk has already built a ‘Logical Volume Manager’ (LVM) that allows multiple SSDs to be pooled together as a single logical volume in their Windows/Linux product. This LVM will be in the future vSphere product too, and is the foundation for the new QoS features.
- Finally, FlashSoft mentioned that they would also like to examine the virtual machine I/O, identify I/O patterns which would benefit from caching and enable caching only for those VMs which need it.
I thoroughly enjoyed my conversation with Serge, Manish and the folks at SanDisk. They took a lot of time to talk about what their product can do now, and what features and functionality they are adding going forward. Hopefully this post has given you an idea about some of the questions to ask SanDisk if you are speaking to them. They are well worth checking out if you are in the market for an I/O accelerator product.
SanDisk and the FlashSoft team will be at VMworld EMEA. If you are attending VMworld 2013 in Barcelona, drop by their booth to learn more.