vSphere 5.1 Storage Enhancements – Part 3: vCloud Director

In this post, I want to highlight a number of storage improvements made in vSphere 5.1 that are going to be leveraged by the next release of vCloud Director.

Scalability

First off, we have the new file sharing scalability enhancements made in VMFS-5, which now allow up to 32 hosts to share a single file. This is covered in detail in part 1 of this vSphere 5.1 storage enhancements series of blog posts. What it means for vCloud Director is that vApps deployed on linked clones can now have many more hosts sharing the base disk on a VMFS-5 datastore.

VAAI NAS Offload

vSphere 5.0 introduced the offloading of linked clones for VMware View to native snapshots on the array via NAS VAAI primitives. You can read more about this here. vSphere 5.1 NAS VAAI enhancements will allow array based snapshots to be used for vCloud Director vApps based on linked clones, in addition to being used for VMware View.

When VMware vCloud Director does a fast provision of a vApp/VM, it will transparently use VAAI NAS to offload the creation of the subsequent linked clones to VAAI supported arrays.

Just like VAAI NAS support for VMware View in vSphere 5.0, this feature will also require a special VAAI NAS plug-in from the storage array vendor.

At the time of writing this article, NetApp already have this feature included in their next VSC release (4.1) which is currently in beta.

If “Fast Provisioning” is used on the Org vDC Storage settings AND the check box “Enable VAAI for fast provisioning” on the overall system Datastore settings is selected, it will trigger the right commands to use a native array-based snapshot for a linked clone instead of a standard redo log based one.
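The combination of settings described above can be summarised in a few lines of Python. This is only a sketch of the decision as I have described it, not how vCloud Director implements it: the function name, the argument names and the returned labels are all made up for illustration, and the full-clone fallback assumes that a vApp deployed without Fast Provisioning is a full copy.

```python
def linked_clone_backing(fast_provisioning: bool,
                         vaai_for_fast_provisioning: bool,
                         vendor_nas_plugin_installed: bool) -> str:
    """Return an illustrative label for how a newly deployed VM is backed.

    All names and labels here are hypothetical; they only mirror the
    settings discussed in the text.
    """
    if not fast_provisioning:
        # Assumption: without Fast Provisioning the VM is a full copy.
        return "full-clone"
    if vaai_for_fast_provisioning and vendor_nas_plugin_installed:
        # Org vDC setting AND system datastore check box AND vendor plug-in.
        return "array-native-snapshot"
    # Fast Provisioning without the offload uses a standard linked clone.
    return "redo-log-linked-clone"


if __name__ == "__main__":
    print(linked_clone_backing(True, True, True))
    print(linked_clone_backing(True, False, True))
```

The point to take away is that the offload needs all three pieces in place; missing any one of them falls back to a standard redo log based linked clone.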

Profile Driven Storage Interoperability with vCloud Director

Storage Profiles are now represented in vCloud Director. Storage Profiles must still be configured at the vSphere layer, but they now surface up into vCloud Director. The storage profiles must first be added to a Provider vDC. For example, you might have Gold, Silver & Bronze storage profiles created. This then allows storage to be allocated and managed on a per Org vDC basis. Continuing our example, an organization could be restricted to datastores which are tagged as ‘Silver’. This support for Storage Profiles allows a high level of separation between organizations at the storage level. Below is a screenshot of an Org vDC with two storage profiles, one for iSCSI storage and one for NFS storage.

Profile Driven Storage with vCloud Director

If the Storage Profile associated with a vApp is changed (this can be done via the properties of a vApp), the vApp is automatically Storage vMotion’ed to a compliant datastore. It is great to see vCloud Director leveraging this excellent vSphere feature.
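To make the Gold/Silver/Bronze example a little more concrete, here is a minimal Python sketch of the compliance idea. It is a toy model only: the `Datastore` class, the datastore names and both helper functions are invented for illustration and have nothing to do with the actual vSphere or vCloud Director APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Datastore:
    name: str      # hypothetical datastore name
    profile: str   # storage profile the datastore is tagged with


def compliant_datastores(datastores: List[Datastore], profile: str) -> List[Datastore]:
    """Datastores an Org vDC limited to `profile` would be allowed to use."""
    return [ds for ds in datastores if ds.profile == profile]


def needs_storage_vmotion(current: Datastore, new_profile: str) -> bool:
    """True if changing the vApp's profile leaves it on a non-compliant datastore."""
    return current.profile != new_profile


if __name__ == "__main__":
    inventory = [
        Datastore("iscsi-ds-01", "Gold"),
        Datastore("nfs-ds-01", "Silver"),
        Datastore("nfs-ds-02", "Silver"),
    ]
    print([ds.name for ds in compliant_datastores(inventory, "Silver")])
    print(needs_storage_vmotion(inventory[0], "Silver"))
```

In this model, changing a vApp on `iscsi-ds-01` from Gold to Silver makes its current datastore non-compliant, which is the situation where the automatic Storage vMotion kicks in.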

Storage DRS Interoperability with vCloud Director

One of the major enhancements in vSphere 5.1 is to provide interoperability between Storage DRS and vCloud Director. This essentially means that vCloud Director 5.1 now recognises datastore cluster objects from Storage DRS. Just like Storage Profiles, the configuration of Storage DRS is done at the vSphere layer, but the resulting datastore clusters and their respective configuration surface up into vCloud Director. To make this interoperability work, Storage DRS now understands linked clones, which it did not do previously. Going forward, vCloud Director can use Storage DRS for initial placement, space utilization and I/O load balancing of vApps based on linked clones.

Snapshots

Snapshot Management in vCloud Director

The last feature introduced in vSphere 5.1 & vCloud Director 5.1 is the ability to take Virtual Machine snapshots from within vCloud Director. Previously one had to take these snapshots at the vSphere layer. As the screenshot shows, you can now Create, Remove and Revert a snapshot via the vCloud Director UI.

Although this might be considered a minor improvement, it does alleviate some additional administration which was necessary in previous versions of vSphere/vCloud Director.

I guess the next question then is how do you tell if you have a snapshot on the VM?

By default this information is not displayed on the Virtual Machine view. To show it, select the option to display the column headings, which is on the right of the screen, and place a tick in the Snapshot column. You will now have a column denoting whether or not there is a snapshot for the Virtual Machine, as per the screenshot below.

vCloud Director Snapshot Management

It is nice to see these vSphere storage features being leveraged by vCloud Director. It’s especially nice to see some of the interoperability between products and features.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

vSphere 5.1 Storage Enhancements – Part 2: SE Sparse Disks

This is possibly the most exciting new storage feature in the vSphere 5.1 release. Space Efficient Sparse Virtual Disks (or SE Sparse Disks for short) were designed to alleviate two issues. Let’s describe these issues first of all.

Problem Statement #1 – Let’s take a Guest OS running on a linked clone (a View desktop if you will), and this Guest OS issues a 4KB write. The vmfsSparse disk (which is the format used by traditional linked clones) has a block allocation unit size of 512 bytes. In other words, this Guest OS is backed by 512 byte blocks. Depending on the applications deployed in the Guest OS, a worst case scenario is that these 512 byte blocks may not be contiguous on the VMDK, and thus may not be contiguous on the VMFS or NFS datastore. This could lead to multiple writes taking place on the back-end storage array for a single Guest OS write. Another side effect is that the partition created in the Guest OS may also be misaligned (because of the very small allocation unit size), again causing multiple writes to take place on the array for a single Guest OS write. Finally, this 512 byte block allocation unit size may not match the block size preference of the storage array, leading to additional overhead in handling these smaller, partial writes.
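To put some rough numbers on this, the short Python sketch below counts how many allocation units (grains) a single Guest OS write touches. The function name is my own, and the sector 63 partition start is simply a commonly seen legacy layout that I picked to illustrate misalignment. The count is an upper bound on the number of separate back-end writes in the worst case where no two grains are contiguous.

```python
SECTOR = 512  # bytes


def grains_touched(offset_bytes: int, length_bytes: int, grain_bytes: int) -> int:
    """Number of grains covered by a write of length_bytes at offset_bytes."""
    first = offset_bytes // grain_bytes
    last = (offset_bytes + length_bytes - 1) // grain_bytes
    return last - first + 1


if __name__ == "__main__":
    # An aligned 4KB write on 512 byte grains spans 8 grains.
    print(grains_touched(0, 4096, 512))
    # The same write on 4KB grains spans a single grain.
    print(grains_touched(0, 4096, 4096))
    # A 4KB write at a partition starting on sector 63 straddles two 4KB grains.
    print(grains_touched(63 * SECTOR, 4096, 4096))
```

So a single 4KB Guest OS write can fan out to as many as eight 512 byte pieces, and a misaligned partition still costs two back-end writes even with a larger grain.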

Problem Statement #2 – The major space inefficiency issue of allocating as yet unused blocks in the Guest OS filesystem/database has basically been addressed by Thin Provisioning. However, another major space efficiency issue still exists: the issue of reclaiming Stale/Stranded data from within a Guest OS. While VMware has addressed this at the datastore level with the VAAI UNMAP primitive, it is still an issue from within the Guest OS. This is particularly problematic with VMware View Desktops deployed on linked clones. These desktops start off as very small in size, but over a period of time they will grow and may end up being as big as the base disk (again, worst case scenario). This then requires administrative intervention to reduce the size of the desktops.

Now that we understand the main issues, let’s see how the new SE Sparse Disk format helps to address them.

Addressing Issue #1 – By default the grain size/block allocation unit size for Virtual Machine disks on ESX is 4KB. The vmfsSparse format, used by snapshots and linked clones, has a grain size of 512 bytes or 1 sector. The vmfsSparse format gets 16MB chunks at a time from VMFS, but then allocates them 512 bytes at a time. This is the root cause of many of the performance/alignment complaints that we currently get with linked-clones/snapshots, and what we are addressing with SE Sparse Disks.

With the introduction of SE Sparse disks, the grain size/block allocation unit size is now tuneable and can be set based on the preferences of a particular storage array or application. Note however that this full tuning capability will not be exposed in vSphere 5.1.

Addressing Issue #2 – One of the major features of the new SE Sparse Disk is its ability to reclaim previously used space within the Guest OS. This stale data is data that was previously written to, but is currently in unaddressed blocks in a file system/database. Customers used to have to carry out some very manual processes to reclaim this stranded space in the past, using a combination of Guest OS tools and vSphere technologies (e.g. sdelete followed by Storage vMotion).

There are two steps involved in the space reclamation feature. The first step is the wipe operation, which scans the Guest OS looking for stranded space and reorganizes the Virtual Machine Disk to free up a contiguous area of free space.

The second step is the shrink operation, which initiates either a SCSI UNMAP operation (block devices) or an RPC truncate (NFS) to delete the contiguous area of free space at the end of the VMDK, reducing its size, and then telling the storage array that it can now reclaim that area of free space.

The Wipe operation is initiated by an API call to the VMware Tools running in the Guest OS. This will allow the task to be scheduled out of hours so that there is no impact on the desktops. This initiates a scan of the filesystem looking for unused filesystem blocks.

When we know which blocks are free, we get the vSCSI layer to reorganise the SE Sparse Disk by moving blocks from the end of the SE Sparse disk to unallocated blocks at the beginning of the SE sparse disk. The SE Sparse disk metadata contains a bitmap where 1 bit represents a 4KB block and indicates if the block is allocated or unallocated.
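As a quick illustration of how small this allocation bitmap is, here is a Python sketch that sizes it for a given disk and sets and tests individual bits. Only the "1 bit per 4KB block" ratio comes from the description above. The helper names, the least-significant-bit-first ordering and the meaning of 1 as "allocated" are my own assumptions, not the real on-disk SE Sparse layout.

```python
GRAIN = 4 * 1024  # bytes represented by one bit


def bitmap_bytes(disk_bytes: int) -> int:
    """Bytes needed for a bitmap with one bit per 4KB block (rounded up)."""
    blocks = -(-disk_bytes // GRAIN)   # ceiling division
    return -(-blocks // 8)


def mark(bitmap: bytearray, block: int, allocated: bool) -> None:
    """Set or clear the bit for `block` (bit order is an assumption)."""
    if allocated:
        bitmap[block // 8] |= 1 << (block % 8)
    else:
        bitmap[block // 8] &= ~(1 << (block % 8)) & 0xFF


def is_allocated(bitmap: bytearray, block: int) -> bool:
    return bool(bitmap[block // 8] & (1 << (block % 8)))


if __name__ == "__main__":
    # A 40 GiB disk has 10,485,760 blocks of 4KB, so the bitmap is 1.25 MiB.
    print(bitmap_bytes(40 * 2**30))
```

In other words, tracking allocation at 4KB granularity costs roughly 32KB of bitmap per GiB of virtual disk.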

When there is a contiguous range of free space at the end of the SE Sparse Disk, a SCSI UNMAP command is sent to reclaim those blocks, and truncate/shrink the SE sparse disk. Note that this is the same UNMAP primitive which we introduced in VAAI improvements in vSphere 5.0, so this will cause overhead on the storage arrays and could have a significant impact on performance for some storage arrays, just like dead space reclamation for VMFS-5 deployed on Thin Provisioned LUNs. This is why the recommendation is to run this reclaim feature out of hours or during a maintenance window.

During the shrink operation, allocated blocks at the end of the SE Sparse disk are moved to unallocated space at the beginning of the disk. This will leave a contiguous unallocated section at the end of the SE Sparse disk which can be truncated during the shrink operation.
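The block shuffling described above is essentially a compaction. The Python sketch below shows one simple way to do it with two pointers over a list of allocated/unallocated flags. This is my own simplification to illustrate the idea and is not the actual vSCSI implementation; the function name and the `(source, destination)` move list are made up.

```python
from typing import List, Tuple


def compact(allocated: List[bool]) -> Tuple[List[Tuple[int, int]], int]:
    """Plan moves of allocated blocks from the end into free slots at the start.

    Returns (moves, new_length), where each move is (source, destination)
    and new_length is the number of blocks left after truncating the free tail.
    """
    bitmap = list(allocated)          # work on a copy
    moves: List[Tuple[int, int]] = []
    lo, hi = 0, len(bitmap) - 1
    while True:
        while lo < hi and bitmap[lo]:        # find first free block
            lo += 1
        while hi > lo and not bitmap[hi]:    # find last allocated block
            hi -= 1
        if lo >= hi:
            break
        bitmap[lo], bitmap[hi] = True, False
        moves.append((hi, lo))
    new_length = len(bitmap)
    while new_length > 0 and not bitmap[new_length - 1]:
        new_length -= 1
    return moves, new_length


if __name__ == "__main__":
    layout = [True, False, True, False, False, True, True, False]
    moves, new_length = compact(layout)
    print(moves)        # [(6, 1), (5, 3)]
    print(new_length)   # 4
```

In this example two block moves are enough to leave the second half of the disk empty, and that contiguous tail is what would then be handed to UNMAP (block) or truncate (NFS).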

Note that the Virtual Machines require HWv9 to handle the SCSI UNMAP command in the Guest OS – earlier versions will not know how to handle this command.

Use Case

There is a very specific use case for SE Sparse Disks in vSphere 5.1. The scope of SE Sparse Disks in vSphere 5.1 has been restricted to a VMware View use case when VMware View Composer uses “Linked Clones” for the roll-out of desktops.

VMware View desktops will also benefit from the new 4KB grain size, as it addresses the partial write and alignment issues experienced by some storage arrays when the 512 bytes grain size found in the vmfsSparse format is used by linked clones.

SE Sparse Disks also give far better space efficiency to desktops deployed on this virtual disk format since it has the ability to reclaim stranded space from within the Guest OS.


vSphere 5.1 Storage Enhancements – Part 1: VMFS-5

Welcome to the first in a series of posts related to new storage enhancements in vSphere 5.1. The first of these posts will concentrate on VMFS. There are two major enhancements to VMFS-5 in the vSphere 5.1 release.

VMFS File Sharing Limits Increase

Prior to vSphere 5.1, the maximum number of ESXi hosts which could share a read-only file on a VMFS filesystem was 8. This was a limiting factor for those products and features which used linked clones. Linked Clones are simply “read/write” snapshots of a “master or parent” desktop image. In particular, it was a limitation for vCloud Director deployments using linked clones for Fast Provisioning of vApps and VMware View VDI deployments using linked clones for desktops.

In vSphere 5.1, we are increasing this maximum number of hosts that can share a read-only file (or to state this another way, we are increasing the number of concurrent host locks) to 32. This will only apply to hosts running vSphere 5.1 and higher on VMFS-5. Now vCloud Director and VMware View deployments using linked clones can have 32 hosts sharing the same base disk image.
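For planning purposes, the rule above can be written as a small Python check. The function names are hypothetical, and treating a mixed-version cluster as limited by its oldest host is my own conservative assumption rather than something stated in the release.

```python
from typing import Sequence, Tuple

Version = Tuple[int, int]  # (major, minor), e.g. (5, 1)


def max_hosts_sharing_file(esxi_version: Version, vmfs_major: int) -> int:
    """Read-only file sharing limit on VMFS as described in this post."""
    if vmfs_major == 5 and esxi_version >= (5, 1):
        return 32
    return 8


def cluster_fits(host_versions: Sequence[Version], vmfs_major: int) -> bool:
    """True if every host could share one base disk (conservative assumption)."""
    if not host_versions:
        return True
    limit = min(max_hosts_sharing_file(v, vmfs_major) for v in host_versions)
    return len(host_versions) <= limit


if __name__ == "__main__":
    print(cluster_fits([(5, 1)] * 16, vmfs_major=5))  # True
    print(cluster_fits([(5, 0)] * 16, vmfs_major=5))  # False
```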

This makes VMFS-5 datastores as scalable as NFS for VDI deployments using VMware View and vCloud Director deployments when using linked clones.

It should be noted that VMware View 5.0 and earlier versions limited the number of hosts which could use linked-clone based desktops to 8. This was true for both VMFS and NFS datastores. VMware View 5.1, released earlier in 2012, increased this host count to 32 for NFS on vSphere 5.0. With the next release of VMware View & vSphere 5.1, you can have 32 hosts sharing the same base disk with both NFS & VMFS datastores.

One final point – this is a driver only enhancement. There are no on-disk changes required on the VMFS-5 volume to benefit from this new feature. Therefore customers who are already on vSphere 5.0 and VMFS-5 need only move to vSphere 5.1. There is no upgrade or change required to their already existing VMFS-5 datastores.

VOMA – vSphere On-disk Metadata Analyzer

VOMA is a new customer-facing metadata consistency checker tool, which is run from the CLI of ESXi 5.1 hosts. It checks both the Logical Volume Manager (LVM) and VMFS for issues. It works on both VMFS-3 & VMFS-5 datastores. It runs in a check-only (read-only) mode and will not change any of the metadata. There are a number of very important guidelines around using the tool. For instance, VMFS volumes must not have any running VMs if you want to run VOMA. VOMA will check for this and will report back if there are any local and/or remote running VMs. The VMFS volumes can be mounted or unmounted when you run VOMA, but you should not analyze the VMFS volume if it is in use by other hosts.
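As a rough sketch of how you might wrap this in a script, the Python helper below refuses to build a VOMA command line if you tell it there are running VMs on the volume. The helper and its arguments are hypothetical. The `-m vmfs -f check -d <device>` form is the invocation as I understand it for ESXi 5.1, so please verify it against the help output of `voma` on your own host before relying on it. The device path shown is a placeholder, not a real device.

```python
from typing import List, Sequence


def voma_check_command(device: str, running_vms: Sequence[str] = ()) -> List[str]:
    """Build the argument list for a read-only VOMA check (illustrative only).

    `device` is the partition backing the VMFS volume; `running_vms` is a
    list you supply yourself of VMs still running on that volume.
    """
    if running_vms:
        raise RuntimeError(
            "Power off or migrate these VMs before running VOMA: "
            + ", ".join(running_vms)
        )
    # Assumed syntax: module 'vmfs', function 'check', device path.
    return ["voma", "-m", "vmfs", "-f", "check", "-d", device]


if __name__ == "__main__":
    # Placeholder device path; substitute the real naa identifier and partition.
    print(" ".join(voma_check_command("/vmfs/devices/disks/naa.xxx:1")))
```

The helper only builds the command and does not run it, since the tool itself has to be executed from the ESXi shell.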

If you find yourself in the unfortunate position of suspecting data corruption on your VMFS volume, prepare to do a restore from backup, or look to engage with a 3rd party data recovery organization if you do not have backups. VMware support will be able to help in diagnosing the severity of any suspected corruption issues, but they are under no obligation to recover your data.

I’m sure you will agree that this is indeed a very nice tool to have at your disposal.
