A closer look at vSphere Flash Read Cache – vFRC

I was going to make this part 11 of my vSphere 5.5 Storage Enhancements series, but I thought that since this is such a major enhancement to storage in vSphere 5.5, I’d put a little more focus on it. vFRC, short for vSphere Flash Read Cache, is a mechanism whereby the read operations of your virtual machine are accelerated by using an SSD or a PCIe flash device to cache the disk blocks of the application running in the Guest OS of your virtual machine. Now, rather than going to magnetic disk to read a block of data, the data can be retrieved from a flash cache layer to improve performance and lower latency. This is commonly known as a write-through cache, as opposed to a write-back cache, where the write operation is acknowledged when the block of data enters the cache layer.
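To make the write-through idea a little more concrete, here is a minimal Python sketch of a read cache sitting in front of a slower backing store. This is a toy model and not how vFRC is implemented: the class name, the use of dictionaries for "flash" and "disk", the unbounded cache size and the choice to refresh a cached block on write are all assumptions made purely for illustration.

```python
class WriteThroughReadCache:
    """Toy write-through read cache in front of a slow backing store."""

    def __init__(self, backing_store):
        self.backing = backing_store  # dict standing in for magnetic disk
        self.cache = {}               # dict standing in for the flash layer
        self.hits = 0
        self.misses = 0

    def read(self, block):
        # A hit is served from the cache; a miss goes to the backing store
        # and populates the cache for next time.
        if block in self.cache:
            self.hits += 1
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]
        self.cache[block] = data
        return data

    def write(self, block, data):
        # Write-through: the write always lands on the backing store first,
        # so the cache never holds the only copy of a block.
        self.backing[block] = data
        if block in self.cache:
            # Assumption for this sketch: refresh the cached copy so that
            # later reads do not return stale data.
            self.cache[block] = data


disk = {0: b"a", 1: b"b"}
cache = WriteThroughReadCache(disk)
cache.read(0)          # miss, fetched from disk
cache.read(0)          # hit, served from cache
cache.write(0, b"z")   # goes straight through to disk
```

Because every write reaches the backing store, losing the cache only costs performance, never data. A write-back cache would acknowledge the write once it was in the cache layer and flush it to disk later.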

Components

Flash Pool. A flash pool, as you might suspect, is a pool of flash devices on an ESXi host which can be ‘owned’ by vFRC for I/O acceleration purposes. When vFRC creates a flash pool, what it actually does is create a VMware proprietary filesystem called VFFS on the flash device. When read cache is later allocated to virtual machine disks, files are created on VFFS corresponding to the amount of cache that has been allocated to the virtual machine. Read operations from the virtual machine are then directed to the cache file on VFFS, and since this is on a flash device, successful cache hits should accelerate I/O.

Flash Cache Module. There is a second component which forms part of vSphere Flash Read Cache called the Flash Cache Module. In vSphere 5.5, the name of this module is vFC. This is a loadable kernel module and its purpose is to cache the data that is read by the Guest OS. vFRC uses an adaptive caching mechanism (ARC) which is similar in many respects to an LRU (least recently used) approach, but ARC tries to identify long-term data and keep that data in cache since there is a higher likelihood that it will be reused. There is no need to install any new VIBs, patches or agents to use vFRC – the required components are built into vSphere 5.5.

Configuration

This feature is only available in vSphere 5.5 right now; it isn’t available in earlier versions of vSphere. It can also only be configured through the vSphere web client, not the older C# client. There are a couple of key steps to configuring vFRC.

Step 1 – Configure the Flash Pool

This is where you decide which flash devices, be they SSD drives or PCIe flash devices, are added to your flash pool. In my example, I am using a Fusion-io PCIe flash device. This step builds the VFFS on the flash device. You can have a maximum of 8 flash devices in the flash pool. Once the flash pool is created, it may be carved up into chunks of cache for accelerating the I/O of your virtual machines. The maximum size of a flash device in vSphere 5.5 is 4TB and the maximum size of the flash pool is 32TB, so theoretically you could have 8 x 4TB flash devices in a flash pool. You can have only one VFFS per ESXi host. Space on VFFS is only consumed by virtual machines when they are powered on. When the virtual machines are powered off, their previously consumed space is put back into the flash pool. This screenshot shows a sample flash pool configuration:
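The limits above are easy to sanity check before you start adding devices. The following Python sketch is purely illustrative: the function name and the idea of passing device sizes as a list are my own assumptions, and only the three limits (8 devices, 4TB per device, 32TB per pool) come from the text.

```python
# Limits quoted above for vSphere 5.5.
MAX_DEVICES = 8
MAX_DEVICE_TB = 4
MAX_POOL_TB = 32


def validate_flash_pool(device_sizes_tb):
    """Return a list of problems with a proposed flash pool (empty if OK)."""
    problems = []
    if len(device_sizes_tb) > MAX_DEVICES:
        problems.append(f"{len(device_sizes_tb)} devices exceeds the limit of {MAX_DEVICES}")
    for size in device_sizes_tb:
        if size > MAX_DEVICE_TB:
            problems.append(f"{size}TB device exceeds the limit of {MAX_DEVICE_TB}TB")
    total = sum(device_sizes_tb)
    if total > MAX_POOL_TB:
        problems.append(f"{total}TB pool exceeds the limit of {MAX_POOL_TB}TB")
    return problems


print(validate_flash_pool([4] * 8))  # the theoretical maximum: 8 x 4TB = 32TB
print(validate_flash_pool([4] * 9))  # one device too many
```

The first call reports no problems, while the second trips both the device count and the pool size limits.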

[Screenshot: Resources Added]

Step 2 – Create flash cache object

Now we come to creating cache files on the VFFS. These cache files are created in the “Edit Settings > Virtual Hardware > Disk File” of the virtual machine. A number of considerations need to be taken into account before building the flash cache object. For instance, you need to consider how big to make it. If you’re not sure, a good rule of thumb is to set the cache size to be about 10% of the size of your VMDK. This rule of thumb was arrived at by monitoring virtualized applications, where it was observed that a typical virtualized application has roughly 10% active data. Of course, not all applications are the same; some will have more live data, others will have less. You would need to profile your own applications to arrive at an exact figure, but this 10% should get you started.
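As a quick worked example of the 10% rule of thumb, here is a small Python sketch. The function name and the `active_fraction` parameter are hypothetical; the only thing taken from the text is the 10% starting point, which you would replace with your own figure once you have profiled the application.

```python
def suggested_cache_gb(vmdk_size_gb, active_fraction=0.10):
    """Starting-point cache size for a VMDK, based on its share of active data."""
    if vmdk_size_gb < 0:
        raise ValueError("VMDK size cannot be negative")
    if not 0 <= active_fraction <= 1:
        raise ValueError("active_fraction must be between 0 and 1")
    return vmdk_size_gb * active_fraction


# A 500GB VMDK with the default 10% assumption.
print(suggested_cache_gb(500))
# The same VMDK for an application profiled at 25% active data.
print(suggested_cache_gb(500, active_fraction=0.25))
```

With the default, a 500GB VMDK gets a starting reservation of about 50GB; the second call shows how the figure moves once you know the application's real working set.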

You also need to consider the block size of the cache file, often referred to as the cache line size. This should be made to be the same size as the most commonly used block of the application running inside the Guest OS. How do you figure that out? Well, a good way to do this is to use a command that is readily available on the ESXi host called vscsiStats. This will display the most commonly used block sizes of a virtual machine, and based on this output you can set the block size accordingly. Here is an article I wrote previously on vscsiStats to get you started. The block size can be configured from 4KB to 1024KB.
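Once you have an I/O size histogram for the virtual machine, picking the block size is mechanical. The Python sketch below shows one way you might do it. The histogram is hand-entered, hypothetical data and is not parsed from real tool output; the function name is my own, and I am assuming the selectable block sizes are the powers of two between 4KB and 1024KB.

```python
# Assumption: selectable cache block sizes are powers of two from 4KB to 1024KB.
VALID_BLOCK_SIZES_KB = [4, 8, 16, 32, 64, 128, 256, 512, 1024]


def pick_cache_block_size_kb(io_size_histogram):
    """Pick a block size from a {io_size_in_bytes: observed_count} histogram."""
    if not io_size_histogram:
        raise ValueError("histogram is empty")
    # Find the I/O size that the workload issued most often.
    most_common_bytes = max(io_size_histogram, key=io_size_histogram.get)
    most_common_kb = most_common_bytes / 1024
    # Use the smallest valid block size that covers it, capped at the maximum.
    for size_kb in VALID_BLOCK_SIZES_KB:
        if size_kb >= most_common_kb:
            return size_kb
    return VALID_BLOCK_SIZES_KB[-1]


# Hypothetical workload dominated by 8KB I/Os.
histogram = {4096: 10, 8192: 500, 65536: 20}
print(pick_cache_block_size_kb(histogram))
```

For this made-up workload the 8KB bucket dominates, so 8KB is the block size the function returns. I/O sizes below 4KB round up to 4KB, and anything above 1024KB is capped at 1024KB.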

[Screenshot: vFRC block size options]

By default, the maximum reservation of cache that can be allocated to a VMDK is 400GB of cache per VMDK file. To disable vFRC on a virtual machine, simply enter 0 in the reservation field.

Step 3 (Optional) – Create Host Swap Cache

Many of you may be aware of the swap-to-SSD functionality that was in earlier versions of vSphere. The purpose of this feature is to allow over-committed ESXi hosts to swap to an SSD rather than swapping to magnetic disk. This provided some level of performance improvement on ESXi hosts which were already degraded. With vSphere 5.5, this feature has been renamed to Host Swap Cache, but more importantly, it can leverage vFRC resources to do its swapping. In other words, an SSD or PCIe flash device used for VFFS can also be used by Host Swap Cache. However, this is an optional step and not necessary for a virtual machine to use vFRC. The screenshot below shows the wizard for configuring Host Swap Cache. The maximum value shown is the size of the VFFS.

[Screenshot: Host Swap Cache]

Interoperability

Snapshots: Now that our virtual machine disk has some flash cache, let’s take a look at what happens to the cache associated with a VMDK when various operations take place. The first of these is a snapshot operation. What happens to the cache when a virtual machine has vFRC configured and a snapshot is taken? In this case, the cache is preserved. However, if the virtual machine reverts to a previous snapshot, the current cache contents (cache file) are discarded and a new cache file is built on VFFS.

FSR: What about a Fast Suspend/Resume (FSR) operation which might occur (such as hot-adding hardware to the VM)? With this operation, the cache is also preserved.

VM Suspend/Resume: What happens to the cache when a virtual machine is suspended and then resumed? In this case the original cache file is discarded and a new cache file is built on resume/power-on.

Power Off: A power off or restart operation on the virtual machine will also discard the current cache contents (cache file) and build a new cache file on VFFS.
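The behaviour of the operations covered so far can be summarised as a simple lookup table. This Python sketch just restates the outcomes described above; the operation names and the helper function are labels I made up for illustration and do not correspond to any vSphere API.

```python
# Cache outcome per virtual machine operation, as described above.
# Keys are illustrative labels, not vSphere API names.
CACHE_OUTCOME = {
    "take_snapshot": "preserved",
    "revert_to_snapshot": "discarded",
    "fast_suspend_resume": "preserved",
    "suspend_resume": "discarded",
    "power_off": "discarded",
    "restart": "discarded",
}


def cache_survives(operation):
    """True if the existing cache contents are kept across the operation."""
    return CACHE_OUTCOME[operation] == "preserved"


for op, outcome in CACHE_OUTCOME.items():
    print(f"{op:22} {outcome}")
```

Anything marked as discarded means the virtual machine starts again with a cold cache, so expect read performance to dip until the cache has warmed up.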

vMotion: vFRC is fully integrated with vMotion, and new vMotion workflows will prompt administrators about what to do with the cache in the event of a live migration. One option is to migrate the cache – the other option is to drop the cache. There are pros and cons to both approaches of course. If you choose to migrate the cache along with the data, you will have a ‘warmed’ cache for this virtual machine on its destination host. However, it will take longer to migrate that virtual machine. If you elect to drop the cache, the cache will be instantiated on the destination host, but it will be cold. So while the operation will be quicker, the performance of the virtual machine may be below standard until the cache has sufficiently warmed.

[Screenshot: vmotion-vflash-options]

One other thing with vMotion: you will only be able to select a destination ESXi host that also has vFRC enabled. You will not be able to migrate the virtual machine to a host without vFRC.

[Screenshot: vMotion-vflash]

62TB VMDK: vSphere 5.5 also introduced the much larger 62TB VMDKs (more about that here). While the maximum amount of vFRC that can be attached to a VMDK is 400GB, it is still supported to use vFRC with the much larger VMDK size in vSphere 5.5.
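It is worth doing the arithmetic on what the 400GB limit means for a very large disk. The Python sketch below is only a back-of-the-envelope calculation: the function name is hypothetical and I am assuming 1TB = 1024GB.

```python
MAX_CACHE_GB = 400  # maximum vFRC reservation per VMDK, as stated above


def max_cache_coverage(vmdk_size_gb):
    """Largest fraction of a VMDK that a single vFRC reservation can cover."""
    return min(1.0, MAX_CACHE_GB / vmdk_size_gb)


vmdk_gb = 62 * 1024  # a 62TB VMDK, assuming 1TB = 1024GB
print(f"{max_cache_coverage(vmdk_gb):.2%}")  # prints 0.63%
```

So on a full 62TB VMDK the cache can cover well under 1% of the disk, far short of the 10% rule of thumb from Step 2. vFRC is supported here, but how much it helps will depend on how small the application's active data set really is.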

vFRC also works with vSphere clustering technologies like DRS & vSphere HA.

DRS: DRS will work with virtual machines with vFRC configured, but will try to avoid moving these virtual machines as much as possible. In other words, vSphere DRS won’t automatically migrate virtual machines with vFRC unless a host is placed into maintenance mode or there is a major imbalance in the cluster and the only way it can resolve it is to move the virtual machines with vFRC configured. This can be considered a soft affinity between a VM which has vFRC and its host.

vSphere HA: vSphere HA can protect virtual machines with vFRC configured. In the event of a host failure, the virtual machine is started on another host in the cluster, and the cache required by the virtual machine is allocated from the new host’s flash pool. Obviously the cache is cold to begin with and will need to warm over time for performance to return to optimal levels. Therefore you will have to ensure that all hosts in the cluster support vFRC and have it configured, or the virtual machines with vFRC will not restart if a host failure occurs.
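To illustrate why every host in the cluster needs a flash pool, here is a small Python sketch that works out which surviving hosts could restart a virtual machine with a given cache reservation. This is not how vSphere HA makes placement decisions internally. The `Host` class, the host names and the simple "enough free flash" check are all assumptions for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    flash_pool_free_gb: float  # 0.0 models a host with no flash pool configured


def restart_candidates(hosts, failed_host, cache_reservation_gb):
    """Names of surviving hosts with enough free flash for the VM's cache."""
    return [
        h.name
        for h in hosts
        if h.name != failed_host and h.flash_pool_free_gb >= cache_reservation_gb
    ]


# Hypothetical three-node cluster; esxi-02 has no flash pool.
cluster = [Host("esxi-01", 200.0), Host("esxi-02", 0.0), Host("esxi-03", 80.0)]
candidates = restart_candidates(cluster, failed_host="esxi-01", cache_reservation_gb=50.0)
print(candidates)  # ['esxi-03']
```

If the same virtual machine had a 100GB reservation, the list would come back empty and, in this simplified model, there would be nowhere to restart it.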

If you are looking for further information, Duncan has put together a really good FAQ on vFRC here and Rawlinson also made a pretty good FAQ here. There is also a very good performance white paper on vFRC linked from this blog post here.

8 thoughts on “A closer look at vSphere Flash Read Cache – vFRC”

  1. Does this VFRC provide the same functionality to any VMware virtual machine (regardless of the product that created it) as the CBRC (content based read cache) did for VMware View virtual machines?

    Or is it a read-optimized (for now) version of something like pernix data’s FVP?

    Thanks for the extensive article – I am asking more from a product positioning standpoint

    • Yes – there is no dependency on anything running in the Guest OS, so it can accelerate the reads of any virtual machine (this is dependent on the application benefiting from accelerated reads of course)

  2. What happens if the SSD disk or disks under the VFFS filesystem fail, will the VM/VMDK continue to run without the cache or will it halt?

    • In this case, the reads will have to come from magnetic disk, and the vFRC is bypassed. You will see a performance degradation of course, since this VM is effectively running without vFRC now.