CormacHogan.com

Condusiv V-locity 4 – New caching feature

I recently got hold of a copy of the new V-locity 4 product from Condusiv which was released last month. Condusiv is the new name for Diskeeper, whom you may have heard of before. I first came across them as a provider of software which specialized in optimizing I/O, primarily by preventing file fragmentation on NTFS in a Windows Guest OS. I blogged about them in the past on the vSphere Storage Blog after some discussions around defragmentation in the Guest OS. The new feature takes a portion of memory and uses it as a block cache. I did some preliminary tests with good ol’ IOmeter, and the initial results look quite good.

Disclaimer – the results shown here are for illustrative purposes only. I don’t have a production environment, this is simply using some equipment in my own lab. Nor am I in any way a performance guru. Condusiv have recommendations on how to correctly evaluate their V-locity 4 product which I will share with you shortly.

Test Environment

Two VMs running Windows 7, 1 vCPU, 2GB Memory. Each VM has a 32GB VMDK built on a local VMFS volume. The VMs are on dedicated ESXi 5.0 hosts with no other VMs running.

One of the VMs has V-locity 4 installed, the other does not.

Test 1 – IOMeter settings: 2 workers, 50,000 sectors, 1 outstanding I/O, 4KB, 100% Read, 0% Random.

IOmeter results from running above load on VM without V-locity:

This VM achieved about 15,500 read ops with 1 OIO. Now let's run the exact same test on the VM with V-locity 4 installed. The trick with V-locity & IOmeter is to let IOmeter run for a few minutes, stop it, allow V-locity's algorithms to learn the data patterns, and then restart IOmeter. On the second run, performance improves dramatically.

IOmeter results from running above load on VM with V-locity:

With the same IOmeter settings, we achieved twice as many read IOPS with the V-locity 4 product installed. Note that the CPU is now up at 100%, so the VM is now CPU bound rather than I/O bound. If we added more CPU resources to this VM, we could probably drive far more I/O.
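The run–pause–rerun pattern described above can be sketched generically. The snippet below is not V-locity; it simply measures two sequential 4 KB read passes over the same file, where the second pass may be served from a memory cache (here, the OS page cache standing in for any block cache). All file names and sizes are arbitrary choices for illustration.

```python
import os
import tempfile
import time

BLOCK = 4096  # 4 KB reads, matching the IOmeter test settings

def make_test_file(size_mb=16):
    """Create a scratch file of random data to read back."""
    path = tempfile.mktemp(suffix=".dat")
    with open(path, "wb") as f:
        f.write(os.urandom(size_mb * 1024 * 1024))
    return path

def read_pass(path):
    """Sequential 4 KB reads over the whole file; returns IOPS for the pass."""
    ops = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(BLOCK):
            ops += 1
    return ops / (time.perf_counter() - start)

path = make_test_file()
cold = read_pass(path)   # first pass: blocks likely come from disk
time.sleep(2)            # pause, analogous to stopping IOmeter briefly
warm = read_pass(path)   # second pass: blocks may now come from cache
print(f"cold: {cold:,.0f} IOPS, warm: {warm:,.0f} IOPS")
os.remove(path)
```

On most systems the warm pass is noticeably faster, which is the same qualitative effect the IOmeter restart shows with V-locity, though the mechanism there is Condusiv's own cache rather than the OS page cache.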

Let's do another test, this time making half of the reads random. (BTW, I restarted IOmeter before running the next test – it can be a bit funky with its test results sometimes.)

Test 2 – IOMeter settings: 2 workers, 50,000 sectors, 1 outstanding I/O, 4KB, 100% Read, 50% Random.

IOmeter results from running above load on VM without V-locity:

Not too different from the previous test. Let’s see the behaviour of the VM with V-locity and see whether random vs sequential has made much of a difference.

IOmeter results from running above load on VM with V-locity:

And again, very similar improvements observed.

Now, I am not going to go through all the variations of IOmeter (actually, I'm not even sure this is the right tool for testing what is essentially a cache). But from these very basic tests, it would seem that the new V-locity 4 product is a VM accelerator of sorts. The benefits of a cache are pretty self-explanatory. If reads can be satisfied from cache, performance obviously improves. Also, if a good percentage of the I/O traffic is served from cache, then more I/O bandwidth is available on the underlying storage for I/Os that are not in cache.
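To make the cache benefit concrete, here is a minimal sketch of a generic LRU block cache (my own illustration, not Condusiv's implementation): repeated reads of the same block addresses hit memory instead of going back to the backing store.

```python
from collections import OrderedDict

class BlockCache:
    """Minimal LRU block cache: reads are served from memory when
    possible, otherwise they fall through to the backing store."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, lba, backing_read):
        if lba in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(lba)     # mark as most recently used
            return self.blocks[lba]
        self.misses += 1
        data = backing_read(lba)             # slow path: real disk I/O
        self.blocks[lba] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data

# A working set that fits in the cache misses on the first pass
# and hits memory on every read of the second pass.
cache = BlockCache(capacity_blocks=1024)
disk = lambda lba: b"\x00" * 4096            # stand-in for a disk read
for _ in range(2):
    for lba in range(1000):
        cache.read(lba, disk)
print(cache.hits, cache.misses)              # → 1000 1000
```

The second pass is 100% cache hits, which is the same shape as the IOmeter results above: once the working set is resident, read throughput is limited by CPU and memory rather than by the disk.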

Speaking with Spencer Allingham, the EMEA Technical Director for Condusiv, the proper way to evaluate the new features of this product is to use the built-in Benefits Analyser, which runs over a 3 day period. On the first day, it simply monitors the Guest OS and provides no performance gain. On the second day, V-locity tunes itself using the data it learned on the first day, and on the third day it provides the performance gains. What is really neat is that at the end of the third day, it produces a report showing how much performance has been gained, and details how that performance gain was achieved.

Spencer told me that the product can cache both reactively and proactively: if it sees blocks being commonly accessed, it will ensure they are loaded into the cache, and in addition it uses system monitoring to learn over time which blocks are used at certain times of the day, so that it can pre-load them and have them ready ahead of the time when they are likely to be needed.
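The time-of-day pre-loading idea can be sketched as follows. This is purely my own illustration of the concept Spencer described, not V-locity's actual mechanism: record which block addresses are read during each hour, then suggest the most frequently used ones as pre-load candidates when that hour comes around again.

```python
from collections import Counter, defaultdict

class PreloadPlanner:
    """Records which blocks are read during each hour of the day, then
    suggests blocks to pre-load into cache before that hour recurs."""

    def __init__(self, top_n=8):
        self.by_hour = defaultdict(Counter)  # hour -> Counter of block LBAs
        self.top_n = top_n

    def record(self, hour, lba):
        self.by_hour[hour][lba] += 1

    def preload_set(self, hour):
        """Most frequently accessed blocks for this hour of the day."""
        return [lba for lba, _ in self.by_hour[hour].most_common(self.top_n)]

# Simulate a nightly batch job that reads the same blocks at 02:00 each night
planner = PreloadPlanner(top_n=3)
for _ in range(5):
    for lba in (10, 11, 12, 99):
        planner.record(2, lba)
print(planner.preload_set(2))
```

A real implementation would of course weight recency, handle cache capacity, and so on, but the shape of the idea – learn per-time-slot access patterns, then warm the cache ahead of each slot – is the same.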

One final item which Spencer mentioned is that, aside from the caching, the IntelliWrite feature is still there. This feature is designed to help prevent Windows from splitting files up (fragmentation) as it writes them to the NTFS volume, which in turn allows larger, more sequential I/Os, making I/O more efficient. In V-locity 4, however, Condusiv have made this feature smarter: rather than wasting system resources trying to aggregate all writes, V-locity 4 can now calculate the low-performing fragment size, and will only attempt to aggregate writes that would suffer a performance loss if split up. If the file being written is split into chunks large enough not to cause a performance loss, it is left alone.
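The decision logic described there might look something like the sketch below. The threshold value and the function itself are my own assumptions for illustration, not anything published by Condusiv: the point is simply that only writes which would produce fragments below some "low-performing" size are worth aggregating.

```python
# Assumed threshold for illustration: fragments below 64 KB are
# considered small enough to hurt sequential I/O performance.
LOW_PERF_FRAGMENT = 64 * 1024

def should_coalesce(free_extents, write_size):
    """Hypothetical sketch of the IntelliWrite-style decision: return
    True if writing into the available free extents (sizes in bytes)
    would split the file into fragments below the low-performing size."""
    remaining = write_size
    for extent in free_extents:
        chunk = min(extent, remaining)
        if chunk < LOW_PERF_FRAGMENT:
            return True           # this fragment would be too small
        remaining -= chunk
        if remaining <= 0:
            return False          # every fragment is large enough
    return True                   # not enough space: would fragment badly

# A write that would leave an 8 KB tail fragment is worth aggregating...
print(should_coalesce([256 * 1024, 8 * 1024], 264 * 1024))    # → True
# ...but one that splits into two large fragments can be left alone.
print(should_coalesce([512 * 1024, 512 * 1024], 768 * 1024))  # → False
```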

Sounds pretty good to me. You can get a free trial of V-locity 4 here.

Get notification of these blogs postings and more VMware Storage information by following me on Twitter: @CormacJHogan
