Much kudos to my good friend Paudie who did a lot of this research.
Virtual Volumes Terminology
Let’s begin with a discussion about the new terminology that VVols introduces:
- VASA Provider, or Virtual Volume Storage Provider (let’s call it the VP), is a software component that acts as a storage awareness service for vSphere and mediates out-of-band communication between vCenter and a storage system. The VP can take different forms; some array vendors embed it in the storage controller while others run it in an appliance. An administrator needs to add the details of the VP to vCenter Server. This is usually as simple as providing a URL to the VP, along with some credentials. This information should come from the array vendor’s documentation.
- Protocol Endpoint (PE) is a logical I/O proxy presented to a host to communicate with Virtual Volumes stored on a Storage Container. When a virtual machine on the host performs an I/O operation, the Protocol Endpoint directs the I/O to the appropriate Virtual Volume. It is a LUN on block storage arrays, and a mount point on NAS arrays. PEs must be pre-configured on the array, and then presented to all the ESXi hosts that wish to use VVols. They are discovered or mounted on the ESXi hosts just like a datastore. However, they are not used for storage, just communication.
- Storage Container is a pool of raw storage capacity or an aggregation of storage capabilities that a storage system can provide to virtual volumes. It is not a LUN! However, this is where the Virtual Volumes are created.
- Storage Policy Based Management, through VM Storage Policies, is used for virtual machine provisioning to match storage capabilities to application requirements. The location, layout and storage capabilities of a VM depends on the storage policy associated with the VM.
- Virtual Volume Datastore is a vSphere representation of a Storage Container. When setting up Virtual Volumes, a Virtual Volume datastore is created to introduce the Storage Container to vSphere.
- Virtual Volumes (VVols) are stored natively inside a storage system that is connected through block or file protocols. They are exported as objects by a compliant storage system and are managed entirely by hardware on the storage array. Virtual Volumes are an encapsulation of virtual machine files, virtual disks, and their derivatives.
Configuration Steps for Virtual Volumes
In a nutshell, the following are the configuration steps required on vSphere to use Virtual Volumes:
- Add the VASA Provider to vCenter
- Discover or mount the PEs
- Create the Virtual Volume datastore
- Create VM Storage Policies
- Deploy VMs with a VM Storage Policy
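These steps have an ordering dependency, which can be sketched with a tiny model. Everything below is a hypothetical illustration in Python; none of these names are real vCenter or ESXi APIs.

```python
# Illustrative model of the VVols setup flow and its ordering: each later
# step depends on an earlier one having completed. Hypothetical names only.

class VvolSetup:
    def __init__(self):
        self.done = []

    def run(self, step, requires=None):
        # Refuse to run a step before its prerequisite has completed
        if requires and requires not in self.done:
            raise RuntimeError(f"'{step}' requires '{requires}' first")
        self.done.append(step)

setup = VvolSetup()
setup.run("add VASA provider")
setup.run("discover/mount protocol endpoints")
setup.run("create VVol datastore", requires="add VASA provider")
setup.run("create VM storage policy", requires="create VVol datastore")
setup.run("deploy VM with policy", requires="create VM storage policy")
print(len(setup.done))  # 5
```

Attempting, say, `create VVol datastore` before the VP is registered would raise an error in this model, mirroring the fact that vCenter cannot see the storage container until the VASA Provider has been added.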
Protocol Endpoint – detailed
In today’s LUN-datastore vSphere environments, a datastore serves two distinct purposes: it is the access point for ESXi to send I/O to, and it is also the storage container that stores many virtual machine files (e.g. VMDKs). If we separate the concept of the access point from the storage aspect, we can run with far fewer access points, each of which can refer to a number of storage entities. This is the purpose of the Protocol Endpoint: we can address very many Virtual Volumes with just a few access points.
Protocol Endpoints are LUNs when the storage is block storage. Protocol Endpoints are mount-points when the storage is NAS/NFS. A PE is discovered/mounted in the same way as block and NAS datastores are discovered/mounted today.
Virtual Volumes are said to be bound to a Protocol Endpoint. When it comes to multipathing, an administrator only needs to set up multipathing and load balancing for the PE, and all the Virtual Volumes bound to that PE inherit the same multipathing and load balancing characteristics.
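This inheritance relationship can be pictured with a small sketch. The class and attribute names below are purely illustrative, not a real vSphere or PSA API:

```python
# Hypothetical sketch: VVols bound to a Protocol Endpoint inherit the PE's
# multipathing/load-balancing policy, so the admin configures only the PE.

class ProtocolEndpoint:
    def __init__(self, name, multipath_policy):
        self.name = name
        self.multipath_policy = multipath_policy  # e.g. "RoundRobin"
        self.bound_vvols = []

    def bind(self, vvol_id):
        self.bound_vvols.append(vvol_id)

    def policy_for(self, vvol_id):
        # Every bound VVol simply reports the PE's own policy
        if vvol_id not in self.bound_vvols:
            raise KeyError(f"{vvol_id} is not bound to {self.name}")
        return self.multipath_policy

pe = ProtocolEndpoint("PE-LUN-0", "RoundRobin")
for vvol in ("vm1-config", "vm1-data", "vm1-swap"):
    pe.bind(vvol)

print(pe.policy_for("vm1-data"))  # RoundRobin
```

Note that the policy lives in one place: change it on the PE and every bound VVol picks it up, which is exactly why per-VVol multipathing configuration is unnecessary.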
VVols is supported with the following protocols:
- NFS v3
- iSCSI
- Fibre Channel
- FCoE (Fibre Channel over Ethernet)
It is not supported with NFS v4.1, which is newly introduced in vSphere 6.0.
Storage Containers – detailed
The first thing to point out is that a storage container is not a LUN. It is a logical storage construct for the grouping of virtual volumes and is set up on the storage array by the storage administrator. In some respects, it can be thought of as a pool of storage on the array. The capacity of the container is based on physical storage capacity, and there must be at least one storage container per array. The maximum number of storage containers depends on the array. Multiple different storage containers can be used to logically partition or isolate VMs with diverse storage needs and requirements. A single storage container can be simultaneously accessed via multiple Protocol Endpoints.
When presented to ESXi hosts, the storage container appears as a VVol datastore.
Virtual Volumes – detailed
Those of you who are familiar with VSAN will already be familiar with virtual volumes in some respects. When we talk about virtual volumes or VVols, what we are basically talking about is the encapsulation of virtual machine files, exported as objects by the storage array. Virtual Volumes are created when you perform virtual machine operations such as “Create a Virtual Machine”, “Power on a Virtual Machine” or “Clone or Snapshot a VM”. vCenter associates one or more Virtual Volumes with a Virtual Machine. A Virtual Machine, when deployed to a VVol datastore, can be thought of as being composed of a number of Virtual Volumes.
Types of Virtual Volumes
Just like on VSAN, where each VM is made up of a set of objects, VMs deployed on a VVol datastore will be made up of a number of different VVols. The following are some typical VVols:
- Configuration Volume (Config-VVol) or HOME directory
- Represents a small directory that contains metadata about a VM
- .vmx files
- descriptor files
- log files
- Data Volume
- Corresponds to a Virtual Disk (e.g. VMDK)
- Swap Virtual Volume
- Contains the virtual machine swap file
- Created when the VM is powered on
- Clone/Snapshot Volumes
- Corresponds to a snapshot
- Other Virtual Volume
- Vendor Specific
- Storage array vendors may create their own VVols, for whatever reason
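The mapping from VM lifecycle operations to VVol objects can be sketched as follows. This is an illustrative model only; the object names simply mirror the list above and nothing here is a real vSphere API:

```python
# Hypothetical sketch of which VVol objects back a VM at a given point in
# its lifecycle, per the VVol types listed above.

def vvols_for(vm_name, powered_on=False, snapshots=0):
    vvols = [f"{vm_name}-config",   # Config-VVol: .vmx, descriptors, logs
             f"{vm_name}-data-0"]   # Data-VVol: one per virtual disk
    if powered_on:
        # Swap-VVol exists only while the VM is powered on
        vvols.append(f"{vm_name}-swap")
    # One additional VVol per snapshot
    vvols += [f"{vm_name}-snap-{i}" for i in range(snapshots)]
    return vvols

print(vvols_for("web01", powered_on=True, snapshots=1))
# ['web01-config', 'web01-data-0', 'web01-swap', 'web01-snap-0']
```

So even a simple single-disk VM is backed by at least two VVols, and the count grows with power state, disks, and snapshots.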
A note about queue depths
There has been some concern raised regarding queue depths and Virtual Volumes. Traditional LUNs and volumes typically do not have very large queue depths, so if there are a lot of VVols bound to a PE, doesn’t this impact performance? This is addressed in a number of ways. First, the array vendors are now free to choose any number of PEs to bind their VVols to (i.e. they have full control over the number of PEs deployed, which could be very many). Secondly, we are allowing for greater queue depth for PE LUNs to accommodate a possibly greater I/O density. However, considering that we already provide a choice regarding the number of PEs per storage container, and storage container size, this increased queue depth may not be relevant in many situations.
The role of policies
One thing to keep in mind is that SPBM, Storage Policy Based Management, plays a major role in virtual machine deployment. Once again, just like VSAN, VM deployment is policy driven. Capabilities are surfaced up to vSphere, the administrator builds policies with the capabilities, the policy is chosen when the VM is being created/deployed and the VM’s VVols are created in such a way so as to match the policy requirements.
These capabilities will vary from storage array vendor to storage array vendor, but think of capabilities like dedupe, compression, encryption, flash acceleration, etc. There is no hard and fast list of VVol capabilities – it all depends on the array. If the array supports a capability, then VVols can consume it. The VASA Provider, referenced earlier, is how these capabilities are exposed to vCenter, and this is all under the control of the array vendor.
Now these capabilities can be chosen on a per-VM basis, and the resulting VVols will be placed on the appropriate storage container that can offer these capabilities. The policy can then be checked for compliance throughout the life-cycle of the VM, ensuring that the VM has the required storage feature set. When you hear VMware talking about Software Defined Storage, this is at its very core.
A final point is about the policies. Each policy can have multiple rule-sets, a different rule-set from a different vendor. If the rule-set related to one array cannot satisfy the requirements in the policy, then perhaps the rule-set from another vendor can. Of course, you will need multiple storage containers with different capabilities (or multiple VVol capable arrays from different vendors) for this to work, but hopefully you can see how powerful this feature is.
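The rule-set logic boils down to: a container is compliant if any one of the policy’s rule-sets is fully satisfied by the capabilities that container advertises. Here is a minimal sketch, assuming made-up capability names; this is not the real SPBM engine:

```python
# Minimal SPBM-style matching sketch: a policy is a list of rule-sets (one
# per vendor/array), each a set of required capabilities. A container is
# compliant if ANY rule-set is a subset of its advertised capabilities.

def compliant(container_caps, rule_sets):
    return any(rules <= container_caps for rules in rule_sets)

policy = [
    {"dedupe", "encryption"},        # rule-set aimed at vendor A's array
    {"compression", "flash-accel"},  # rule-set aimed at vendor B's array
]

array_a = {"dedupe", "encryption", "compression"}
array_b = {"flash-accel"}

print(compliant(array_a, policy))  # True  (first rule-set fully satisfied)
print(compliant(array_b, policy))  # False (no rule-set fully satisfied)
```

The "any rule-set" semantics is what lets one policy span arrays from different vendors: each rule-set only needs to be satisfiable somewhere.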
What about VAAI?
This was in fact one of my early questions – what does VVols mean for VAAI? Let’s look at each of the VAAI primitives and discuss any differences. Remember that just like VAAI, the individual array vendors need to support the primitive for it to work with VVols.
ATS (Atomic Test and Set)
- Supported – There is a need to provide clustered file-system semantics and locking for the Config-VVol (the VM home object). Therefore ATS is used for locking.
XCOPY (Cloning, Linked Clones)
- Supported – vSphere has the ability (via API calls) to instruct the array to clone an object (Virtual Volume) on our behalf
UNMAP (Space Reclamation)
- Supported – Keep in mind that there is no file system with VVols. Therefore any space in the storage container that is reserved by a Virtual Volume can be automatically reclaimed by the storage array upon the deletion of that VVol.
- [Update] On further discussion internally, this additional benefit of VVols was highlighted. Without VMFS as a layer in between, UNMAP commands generated by the Guest OS now go straight to the array. That means Windows Server 2012 (and, I understand, Windows 7 as well) will immediately be sending UNMAPs to the storage (for block-based VVols).
Thin Provisioning Out of Space (OOS)
- Supported – The storage container’s ‘Out of Space’ warnings will be advertised to vSphere
A few additional notes about VAAI and VVols. A common question is whether VVols and LUNs/datastores from arrays that use VAAI can be presented to, and co-exist on, the same ESXi host. The answer is absolutely, yes. In this case, should there be a request to clone a VMDK from a datastore to a VVol, VAAI will be used to clone from the VAAI-enabled datastore to a Virtual Volume.
The other interesting point is around VAAI-NAS, which had a different set of primitives when compared to VAAI on block storage. VVols now levels the playing field. For example:
- NAS vendors are no longer required to write plug-ins. Array capabilities are simply advertised through the VASA API
- Historically, Storage VMotion operations on VMs with snapshots running on NFS were not offloaded. Virtual Volumes removes that limitation on NFS, allowing live migrations of snapshots to be offloaded to the array.
- VVol also brings *space efficient* Storage VMotion for an NFS (VVOL) based VM. It is now possible to determine the allocated (written-to) blocks and only migrate that data. Acquisition of the allocated blocks was not (and still is not) possible using traditional NFS.
- Conversely, when it came to Fast File Clone or Linked Clones (offload to native snapshots), this was only available via VAAI-NAS primitives, not block. Virtual Volumes removes the NFS-only restriction.
- [Update] The whole area of Storage VMotion offloads with VVols is quite detailed. Expect a new blog in this area shortly.
We’ve waited a long time for this feature. This is a game changer in the storage space in my humble opinion.