vSphere 6.0 Storage Features Part 5: Virtual Volumes
I pushed this post out a bit as I know that there is a huge amount of information out there around virtual volumes already. This must be one of the most anticipated storage features of all time, with the vast majority of our partners ready to deliver VVol-Ready storage arrays once vSphere 6.0 becomes generally available. We’ve been talking about VVols for some time now. Actually, even I have been talking about it for some time – look at this tech preview that I did way back in 2012 – I mean, it even includes a video! Things have changed a bit since that tech preview was captured, so let’s see what Virtual Volumes 2015 has in store.
Much kudos to my good friend Paudie who did a lot of this research.
Virtual Volumes Terminology
Let’s begin with a discussion about the new terminology that VVols introduces:
- VASA provider, or Virtual Volume Storage Provider (let's call it the VP), is a software component that acts as a storage awareness service for vSphere and mediates out-of-band communication between vCenter and a storage system. The VP can take different forms; some array vendors embed it in the storage controller while others run it in an appliance. An administrator needs to add details of the VP to vCenter Server. This is usually something as simple as providing a URL to the VP, along with some credentials. This information should come from the array vendor's documentation.
- Protocol Endpoint (PE) is a logical I/O proxy presented to a host to communicate with Virtual Volumes stored on a Storage Container. When a virtual machine on the host performs an I/O operation, the protocol endpoint directs the I/O to the appropriate virtual volume. The PE is a LUN on block storage arrays, and a mount point on NAS arrays. PEs must be pre-configured on the array, and then presented to all the ESXi hosts that wish to use VVols. They are discovered or mounted by the ESXi hosts just like a datastore; however, they are not used for storage, just for communication.
- Storage Container is a pool of raw storage capacity or an aggregation of storage capabilities that a storage system can provide to virtual volumes. It is not a LUN! However, this is where the Virtual Volumes are created.
- Storage Policy Based Management (SPBM), through VM Storage Policies, is used for virtual machine provisioning to match storage capabilities to application requirements. The location, layout and storage capabilities of a VM depend on the storage policy associated with the VM.
- Virtual Volume Datastore is a vSphere representation of a Storage Container. When setting up Virtual Volumes, a Virtual Volume datastore is created to introduce the Storage Container to vSphere.
- Virtual Volumes (VVols) are stored natively inside a storage system that is connected through block or file protocols. They are exported as objects by a compliant storage system and are managed entirely by hardware on the storage array. Virtual Volumes are an encapsulation of virtual machine files, virtual disks, and their derivatives.
Configuration Steps for Virtual Volumes
In a nutshell, the following are the configuration steps required on vSphere to use Virtual Volumes (a short verification sketch follows the list):
- Add the VASA Provider to vCenter
- Discover or mount the PEs
- Create the Virtual Volume datastore
- Create VM Storage Policies
- Deploy VMs with a VM Storage Policy
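If you want to sanity-check the result of these steps outside the Web Client, the mounted Storage Container should surface as a datastore whose type reports as VVOL. Below is a minimal pyVmomi (Python) sketch, not an official procedure: the vCenter hostname and credentials are placeholders, and the "VVOL" summary type string is my assumption of how a Virtual Volume datastore identifies itself.

```python
# Minimal pyVmomi sketch: list datastores and flag the VVol ones.
# Hostname/credentials are placeholders; the "VVOL" summary type string
# is an assumption about how a Virtual Volume datastore reports itself.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab use only
si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='password', sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    for ds in view.view:
        s = ds.summary
        print(f"{s.name:30} type={s.type:6} "
              f"capacity={s.capacity // 2**30} GiB free={s.freeSpace // 2**30} GiB")
    view.Destroy()
finally:
    Disconnect(si)
```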
Protocol Endpoint – detailed
In today’s LUN-Datastore vSphere environments, a datastore serves two distinct purposes – it serves as the access point for ESXi to send I/O to, and it also serves as a storage container to store many virtual machine files (e.g. VMDKs). If we separate out the concept of the access point from the storage aspect, we can run with a far smaller number of access points, each of which could refer to a number of storage entities. This is the purpose of the Protocol Endpoint. We can address very many virtual volumes with just a few access points.
Protocol Endpoints are LUNs when the storage is block storage. Protocol Endpoints are mount-points when the storage is NAS/NFS. A PE is discovered/mounted in the same way as block and NAS datastores are discovered/mounted today.
Virtual Volumes are said to be bound to a Protocol Endpoint. When it comes to multipathing, an administrator only needs to set up multipathing and load balancing for the PE, and all the Virtual Volumes bound to that PE inherit the same multipathing and load balancing characteristics.
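To make the access-point versus storage split a little more concrete, here is a small, purely illustrative Python model (it uses no vSphere APIs, and the names are made up): a single PE fronts a large number of bound VVols, and a path policy set once on the PE is what every bound VVol effectively gets.

```python
# Purely illustrative model of the PE/VVol relationship - not a vSphere API.
# A few Protocol Endpoints act as I/O proxies for many bound Virtual Volumes;
# multipathing is configured once per PE and inherited by its bound VVols.
from dataclasses import dataclass, field

@dataclass
class ProtocolEndpoint:
    name: str
    path_policy: str = "VMW_PSP_FIXED"
    bound_vvols: list = field(default_factory=list)

@dataclass
class VirtualVolume:
    name: str
    pe: ProtocolEndpoint = None

    def bind(self, pe: ProtocolEndpoint):
        self.pe = pe
        pe.bound_vvols.append(self)

    @property
    def effective_path_policy(self):
        # A VVol has no path policy of its own; it follows its PE.
        return self.pe.path_policy

pe = ProtocolEndpoint("naa.600a098038303634")          # one access point (a LUN on block)
vvols = [VirtualVolume(f"vm{i}-data.vmdk") for i in range(1000)]
for v in vvols:
    v.bind(pe)                                         # very many VVols, one PE

pe.path_policy = "VMW_PSP_RR"                          # change multipathing once, on the PE
print(vvols[0].effective_path_policy)                  # -> VMW_PSP_RR, inherited by all
```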
Supported Protocols
VVols is supported with the following protocols:
- iSCSI
- NFS v3
- Fibre Channel
- FCoE (Fibre Channel over Ethernet)
It is not supported with NFS v4.1, which is newly introduced in vSphere 6.0.
Storage Containers – detailed
The first thing to point out is that a storage container is not a LUN. It is a logical storage construct for the grouping of virtual volumes and is set up on the storage array by the storage administrator. In some respects, it can be thought of as a pool of storage on the array. The capacity of the container is based on physical storage capacity, and there must be at least one storage container per array. The maximum number of storage containers depends on the array. Multiple different storage containers can be used to logically partition or isolate VMs with diverse storage needs and requirements. A single storage container can be simultaneously accessed via multiple Protocol Endpoints.
When presented to ESXi hosts, the storage container appears as a VVol datastore.
Virtual Volumes – detailed
Those of you who are familiar with VSAN will already be familiar with virtual volumes in some respects. When we talk about virtual volumes or VVols, what we are basically talking about is the encapsulation of virtual machine files on the array, exported as objects. Virtual Volumes are created when you perform virtual machine operations such as “Create a Virtual Machine”, “Power on a Virtual Machine” or “Clone or Snapshot a VM”. vCenter associates one or more Virtual Volumes with a Virtual Machine. A Virtual Machine, when deployed to a VVol datastore, can be thought of as being composed of a number of Virtual Volumes.
Types of Virtual Volumes
Just like on VSAN, where each VM is made up of a set of objects, VMs deployed on a VVol datastore will be made up of a number of different VVols. The following are some typical VVols (a short sketch after the list shows how to view the corresponding files for a VM):
- Configuration Volume (Config-VVol) or HOME directory
  - Represents a small directory that contains metadata about a VM
    - .vmx files
    - descriptor files
    - log files
- Data Volume
  - Corresponds to a Virtual Disk (e.g. VMDK)
- Swap Virtual Volume
  - Contains the Virtual Machine swap file
  - Created when the VM is powered on
- Clone/Snapshot Volumes
  - Corresponds to a snapshot
- Other Virtual Volume
  - Vendor specific
  - Storage array vendors may create their own VVols, for whatever reason
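To see how an individual VM breaks down, the VM’s file layout is visible through the vSphere API regardless of the datastore type. The pyVmomi sketch below is just an illustration (the VM name is a placeholder and it reuses the 'si' connection from the earlier sketch); on a VVol datastore the config, disk, swap and log entries it prints correspond broadly to the config-VVol, data-VVols and swap-VVol described above.

```python
# pyVmomi sketch: dump a VM's file layout. On a VVol datastore these entries
# map broadly onto the VVol types above (config/home, data, swap, snapshot).
# The VM name is a placeholder and 'si' is the connection from the earlier sketch.
from pyVmomi import vim

def dump_vm_layout(si, vm_name):
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next((v for v in view.view if v.name == vm_name), None)
    view.Destroy()
    if vm is None:
        raise LookupError(f"VM {vm_name!r} not found")
    for f in vm.layoutEx.file:
        # f.type is a string such as 'config', 'diskDescriptor', 'diskExtent',
        # 'swap' or 'log'
        print(f"{f.type:16} {f.size:>14} bytes  {f.name}")

# dump_vm_layout(si, "my-vvol-vm")   # hypothetical VM name
```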
A note about queue depths
There has been some concern raised regarding queue depths and Virtual Volumes. Traditional LUNs and volumes typically do not have very large queue depths, so if there are a lot of VVols bound to a PE, doesn’t this impact performance? This is addressed in a number of ways. First, the array vendors are now free to choose any number of PEs to bind their VVols to (i.e. they have full control over the number of PEs deployed, which could be very many). Secondly, we are allowing for greater queue depth for PE LUNs to accommodate a possibly greater I/O density. However, considering that we already provide a choice regarding the number of PEs per storage container, and storage container size, this increased queue depth may not be relevant in many situations.
The role of policies
One thing to keep in mind is that SPBM, Storage Policy Based Management, plays a major role in virtual machine deployment. Once again, just like VSAN, VM deployment is policy driven. Capabilities are surfaced up to vSphere, the administrator builds policies with the capabilities, the policy is chosen when the VM is being created/deployed and the VM’s VVols are created in such a way so as to match the policy requirements.
These capabilities will vary from storage array vendor to storage array vendor, but think of capabilities like dedupe, compression, encryption, flash acceleration, and so on. There is no hard and fast list of VVol capabilities – it all depends on the array. If the array supports a capability, then VVols can consume it. The VASA Provider, referenced earlier, is how these capabilities are exposed to vCenter, and this is all under the control of the array vendor.
Now these capabilities can be chosen on a per-VM basis, and the resulting VVols will be placed on the appropriate storage container that can offer these capabilities. The policy can then be checked for compliance throughout the life-cycle of the VM, ensuring that the VM has the required storage feature set. When you hear VMware talking about Software Defined Storage, this is at its very core.
A final point is about the policies. Each policy can have multiple rule-sets, with different rule-sets coming from different vendors. If the rule-set relating to one array cannot satisfy the requirements in the policy, then perhaps the rule-set from another vendor can. Of course, you will need multiple storage containers with different capabilities (or multiple VVol-capable arrays from different vendors) for this to work, but hopefully you can see how powerful this feature is.
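For completeness, here is roughly what choosing a policy looks like through the API rather than the web client. This is a hedged pyVmomi sketch: the profile ID shown is a placeholder GUID (in practice you would look it up via the SPBM/PBM endpoint or the Web Client), and it simply reconfigures an existing VM’s home object against that policy. Per-VMDK policies are attached the same way, via the profile field on each virtual disk’s device spec.

```python
# Hedged pyVmomi sketch: attach a VM Storage Policy to an existing VM's home
# object via ReconfigVM_Task. The profile ID is a placeholder - in practice it
# comes from the SPBM (PBM) service or the Web Client.
from pyVmomi import vim

def apply_storage_policy(vm, profile_id):
    spec = vim.vm.ConfigSpec()
    spec.vmProfile = [vim.vm.DefinedProfileSpec(profileId=profile_id)]
    return vm.ReconfigVM_Task(spec)   # returns a Task that can be monitored

# task = apply_storage_policy(vm, "aa6d5a82-1c88-45da-85d3-3d74b91a5bad")  # placeholder ID
```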
What about VAAI?
This was in fact one of my early questions – what does VVols mean for VAAI? Let’s look at each of the VAAI primitives and discuss any differences. Remember that, just like VAAI, the individual array vendors need to support the primitive for it to work with VVols.
ATS (Atomic Test and Set)
- Supported – There is a need to provide clustered file-system semantics and locking for the Config VVOL (VM home object). Therefore ATS is used for locking.
XCOPY (Cloning, Linked Clones)
- Supported – vSphere has the ability (via API calls) to instruct the array to clone an object (Virtual Volume) on our behalf
UNMAP
- Supported – Keep in mind that there is no file-system with VVols. Therefore any space in the storage container that is reserved by a Virtual Volume can be automatically reclaimed by the storage array upon the deletion of that VVol.
- [Update] On further discussion internally, this additional benefit of VVols was highlighted. Without VMFS as a layer in between, UNMAP commands generated by the Guest OS now go straight to the array. That means Windows Server 2012 (and, I understand Windows 7 as well) will immediately be sending UNMAPs to the storage (for block-based VVols).
Thin Provisioning Out of Space (OOS)
- Supported – The storage container ‘Out of Space’ warnings will be advertised to vSphere
A few additional notes about VAAI and VVols. A common question is whether or not VVols and LUNs/datastores from arrays that use VAAI can be presented/co-exist on the same ESXi host. The answer is absolutely, yes. In this case, should there be a request to clone a VMDK from a datastore to a VVol, VAAI will be used to clone from the VAAI-enabled datastore to a Virtual Volume.
The other interesting point is around VAAI-NAS, which had a different set of primitives when compared to VAAI on block storage. VVols now levels the playing field. For example:
- NAS vendors are no longer required to write plug-ins. Array capabilities are simply advertised through the VASA API
- Historically, Storage VMotion operations on VMs with snapshots running on NFS were not offloaded. Virtual Volumes removes that limitation on NFS, allowing live migrations of snapshots to be offloaded to the array.
- VVol also brings *space efficient* Storage VMotion for an NFS (VVOL) based VM. It is now possible to determine the allocated (written-to) blocks and only migrate that data. Acquisition of the allocated blocks was not (and still is not) possible using traditional NFS.
- Conversely, when it came to Fast File Clone or Linked Clones (offload to native snapshots ), this was only available via VAAI-NAS primitives, not block. Virtual Volumes removes the NFS only restriction.
- [Update] The whole area of Storage VMotion offloads with VVols is quite detailed. Expect a new blog in this area shortly.
Conclusion
We’ve waited a long time for this feature. This is a game changer in the storage space in my humble opinion.
Stupid question but is it VVOLs or VVols? I did a search on VMware.com and 80 percent are VVOLs but the other 20% is VVols. aggggg.
I’ve noticed a few (mostly older) mentions of vVols as well.
Looks like the official name is VVols.
[Update] I’ve been told VVols or VVOLS – it doesn’t really matter. There are no official guidelines for acronyms, so knock yourselves out 🙂
What about thin provisioning on VVols? Are there thin, eager and lazy provisioned VVols? If so, does the array know how the VVol is provisioned?
Remember now that VM provisioning will be policy driven, so admins will place their request for thin, EZT or LZT in the policy.
This information is sent to the array (as part of the storage policy, and regardless of what the array advertises as its capabilities).
Some arrays may not support “Thick” operations, for instance, and they’ll reject it with OUT_OF_RESOURCE. Some arrays (e.g. NetApp) may need to be specifically configured to permit “Thick” storage allocation (which is, after all, expensive). etc. etc.
So I think we’ll see a few “gotchas” in this space.
Hi Cormac,
Good stuff as always.
I understand that the VAAI Block Zeroing primitive is no longer needed, as it was VMFS that requested the zeroing, and as there is no longer VMFS this does not apply.
So what happens when an EZT VMDK is created on a VVOL?
Does a VVOL-enabled block array have to support Thin, Thick Reserved (LZT) and Thick (EZT) LUNs (in other words, is it becoming more like NFS)?
For products like NetApp FAS it would not make sense to use EZT as there would be no performance advantage.
For arrays that have a significant performance overhead when using thin LUNs it might make sense to use EZT.
This does all mean that with VVOLs we are putting the responsibility back on the storage array vendors to provide efficient thin provisioning – which I know many of them do not.
Also, I have been following VVOLs for many years and now that the detail is available it does not quite work how I expected:
1. It is wrong to say that it works like VSAN as with VSAN you have the ability to on the fly change an attribute on a VMDK (i.e. FTT and no doubt in the future de-duplication).
With VVOLs all you are doing is describing the characteristics of a pool and then placing VMDKs on a pool that meets those needs.
What if you want to change the characteristics (i.e. enable or disable de-duplication, or change a snapshot or replication schedule)?
For me granular VM management is the ability to on the fly change the characteristics of a VMDK as you can with VSAN and you do not appear to be able to do this with VVOLs which is disappointing.
2. I thought VVOLs would remove the need for LUNs, but instead they are just hidden, as a VVOL is in fact a LUN (on block storage)!!!
My understanding is based on the NetApp implementation (I have written more at http://blog.snsltd.co.uk/a-deeper-look-into-netapps-support-for-vmware-virtual-volumes/) and maybe other vendors have done it differently.
Your thoughts would be appreciated.
Many thanks
Mark
Mark,
” 1. It is wrong to say that it works like VSAN as with VSAN you have the ability to on the fly change an attribute on a VMDK (i.e. FTT and no doubt in the future de-duplication).”
NexGen is showing a policy being changed on the fly at 11:03 in the video at https://www.youtube.com/watch?v=OXGm5lq1WTY
Am I reading too much into your statement, or is your comment only regarding NetApp?
Hi Mark,
Sorry for the delay in responding – it’s sort of all hands to the pumps as we prep for 6.0 GA.
Anyway – I see Tom responded to one of your questions – thanks for jumping in here Tom.
You are correct – there is no filesystem layer such as VMFS anymore – everything is handed off to the array. So if you want thin or thick disks, it is up to the array to provide these. And no, there is no obligation on the array to support any of the data services. The array simply presents what it is capable of, and you as an admin must build the policies based on these capabilities, and choose the appropriate policy when deploying the VM.
I’m not going to comment on performance advantages of different arrays, etc. But we’ve been able to reserve space on NFS VMDKs through VAAI-NAS primitives for some time. Whether you want to use it or not is up to you.
Yes – VVols is absolutely putting the onus back on the array vendors to not only provide efficient TP, but also efficient snapshots, dedupe, compression, cloning and eventually replication.
Regarding VVols and VSAN, there are subtle differences, but conceptually they will do the same thing and both fit into our policy driven, software defined storage vision.
VVOLs are “not” LUNs. The PEs are LUNs on block storage but the VVols are most definitely not.
Different vendors will have subtly different implementations too – there are a number of different publications on VVols from the different vendors out there already. These might be worth reviewing if you wish to compare the NetApp implementation with others.
Hope this helps
Cormac
Hi Cormac,
From a licensing point of view, I noticed that VAAI is included with the Enterprise edition whereas VVOLs is included with Standard.
I therefore assume that VVOLs does not have any dependencies on VAAI – is that correct?
You kind of would have thought that VVOLs, VASA, SPBM and VAAI would all be included in Standard edition now.
Seems a bit mean to keep VAAI in Enterprise edition and above.
Many thanks
Mark
I’ll admit that it does appear a little strange. However, these types of decisions are beyond my control I’m afraid.
Hi Cormac,
Thanks for the great post!
Looking forward to seeing future articles on the website or possibly a whole section dedicated to VVOLs too (and possibly another book :O)
Cheers,
Mick.
Yea, Cormac how about another book!
I have a question on Storage Containers (SC). Does an SC contain only a single pool of raw storage, or multiple groups of raw storage with different capabilities which are then advertised by the VASA Provider?
For example, we have 2 VMs, one requiring FC RAID 5 and the other requiring FC RAID 6. Will these VMs be placed on 2 different raw storage groups, or on a single storage group where a hard disk can be shared between VMs, as in VSAN where 2 VMs with different stripe widths (for example, one VM with 1 stripe and a second VM with a different stripe width) share the same hard disk for the first stripe at least?
It will be great if you can write a blog post on Storage Container.There is lot of confusion on this.
A storage container (SC) can do both. An SC may represent one pool of storage with one set of capabilities or it may represent multiple pools of storage with different sets of capabilities. See my diagrams in the post.
Thanks for the reply Cormac. Just one more question, in case an SC contains multiple pools: can these pools be carved out from the same set of hard disks, or does each pool require a separate set of hard disks?
That is going to be completely dependent on the array vendor’s implementation Hari – it could vary.
Our two SANs and one NAS are filled up with volumes on LUNs, and NFS datastores.
Supposing our SANs and NAS are upgraded to support the PE etc., and we upgrade to vSphere 6, would there be any path to migrate/upgrade an existing data infrastructure, or are we supposed to build this from scratch?