vSphere 6.0 Storage Features Part 2: Storage DRS and SIOC
We made a number of enhancements to Storage DRS in vSphere 6.0. This article will discuss the changes and enhancements that we have made. There is a white paper which discusses many of the previous limitations of Storage DRS interoperability and I’d recommend reviewing it. Although a number of years old, it highlights many of the Storage DRS interoperability concerns. As you will see, a great many of these have now been addressed, along with some pretty interesting feature enhancements.
1. Deduplication Interoperability
In the white paper mentioned above, one of the items it discusses is using Storage DRS in conjunction with deduplication. It states that you must determine whether the deduplication process will be as efficient—that is, able to deduplicate the same amount of data—after the migration as it was before the migration. There is always a chance that this is not the case, and this might be a reason not to apply a recommendation to migrate a virtual machine with a large virtual disk.
If Storage DRS moves a virtual disk from datastore A to datastore B, and the datastores share a common backing pool for deduplication, the array may simply inflate the virtual disk contents at datastore A and then re-index them at datastore B, without any real impact on the actual space usage.
The main issue for Storage DRS is that a datastore appears to store more data than it has capacity for. How does Storage DRS do placement on this datastore?
In vSphere 6.0, VASA 2.0 now exposes whether a datastore is being deduplicated, and identifies whether one or more datastores share a common backing pool for deduplication. These enhancements enable Storage DRS to avoid moving VMs between datastores that are not in the same deduplication pool. However, it does allow Storage DRS to manage logical space while keeping virtual disks in the same dedupe pool.
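To make the dedupe-aware behaviour a little more concrete, here is a minimal Python sketch. This is not the Storage DRS implementation. The `Datastore` class, the `dedupe_pool` field and the `candidate_targets` function are hypothetical names, and the sketch assumes that a pool identifier is available per datastore (for example, from a VASA 2.0 provider). It simply keeps the migration targets that sit in the same deduplication pool as the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Datastore:
    name: str
    # Hypothetical pool ID, assumed to come from a VASA 2.0 provider.
    # None means the datastore is not deduplicated.
    dedupe_pool: Optional[str]


def candidate_targets(source: Datastore, datastores: List[Datastore]) -> List[Datastore]:
    """Return the datastores that share the source's dedupe pool.

    Assumption for this sketch: a non-deduplicated source (pool None)
    only matches other non-deduplicated datastores.
    """
    return [
        d for d in datastores
        if d != source and d.dedupe_pool == source.dedupe_pool
    ]


if __name__ == "__main__":
    a = Datastore("datastore-A", "pool-1")
    b = Datastore("datastore-B", "pool-1")
    c = Datastore("datastore-C", "pool-2")
    print([d.name for d in candidate_targets(a, [a, b, c])])
```

In this example only datastore-B is kept as a target for a disk on datastore-A, since datastore-C is backed by a different dedupe pool.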
2. Thin Provisioned Datastore Interoperability
Let’s look once more at what the white paper says about thin provisioned datastores. Storage DRS by itself does not detect whether the LUN is thin or thick provisioned; it detects only the logical size of the LUN. However this logical LUN size could be much larger than the actual available capacity of the backing pool of storage on the storage array.
In previous versions of vSphere, Storage DRS leveraged the VMware vSphere APIs – Storage Awareness (VASA) thin-provisioning threshold. If the datastore exceeds the thin-provisioning threshold of 75 percent, VASA raises the thin-provisioning alarm. This causes Storage DRS to mark the datastore and prevent any virtual disk deployment or migration to that datastore to avoid running out of capacity.
However this did not address the situation where multiple datastores could be backed by the same pool of storage capacity on the array. In Storage DRS in vSphere 6.0, using VASA 2.0, the following changes were made to thin provisioned datastores interop:
- Discover the common backing pool being shared by multiple datastores
- Report the available capacity in the common backing pool
This allows Storage DRS to avoid migrating VMs between two thin provisioned datastores that share the same backing pool. Knowing the available capacity allows Storage DRS to make recommendations based on the actual available space in the shared backing pool rather than the reported capacity of the datastore (which may be larger).
Storage DRS can also provide remediation when the free space in the backing pool is running out by moving VMs away from datastores sharing the same common backing pool.
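The two ideas above (use the real free capacity of the backing pool, and do not try to relieve a pool by moving VMs within it) can be sketched in a few lines of Python. Again, this is only an illustration under stated assumptions: `ThinDatastore`, `effective_free_gb` and `relieves_pool_pressure` are hypothetical names, and the pool ID and pool free capacity are assumed to be reported by a VASA 2.0 provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThinDatastore:
    name: str
    logical_free_gb: float  # free space the datastore itself reports
    backing_pool: str       # hypothetical pool ID, assumed to come from VASA 2.0
    pool_free_gb: float     # available capacity reported for that backing pool


def effective_free_gb(ds: ThinDatastore) -> float:
    """Usable free space is bounded by what the backing pool can really supply."""
    return min(ds.logical_free_gb, ds.pool_free_gb)


def relieves_pool_pressure(source: ThinDatastore, target: ThinDatastore) -> bool:
    """A move only frees pool capacity if the target is on a different pool."""
    return source.backing_pool != target.backing_pool


if __name__ == "__main__":
    ds1 = ThinDatastore("ds1", logical_free_gb=500, backing_pool="pool-1", pool_free_gb=120)
    ds2 = ThinDatastore("ds2", logical_free_gb=400, backing_pool="pool-1", pool_free_gb=120)
    ds3 = ThinDatastore("ds3", logical_free_gb=300, backing_pool="pool-2", pool_free_gb=900)
    print(effective_free_gb(ds1))
    print(relieves_pool_pressure(ds1, ds2), relieves_pool_pressure(ds1, ds3))
```

Here ds1 reports 500 GB free, but the sketch treats only 120 GB as usable because that is all the shared pool has left. Moving a VM from ds1 to ds2 would not help, as both sit on the same pool.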
3. Array-based auto-tiering Interoperability
The interoperability white paper states that by default, Storage DRS is invoked every 8 hours and requires performance data captured over more than a 16 hour period to generate I/O load balancing decisions. However there are multiple storage vendors offering auto-tiering storage arrays, which move hot and cold chunks of data between the different pools of storage. Each array uses different time cycles to collect and analyze workload before moving LUN segments. Some auto-tiering solutions move chunks based on real-time workload; other arrays move chunks after collecting performance data for 24 hours. The misalignment of Storage DRS invocation and auto-tiering algorithm cycles makes it unpredictable when LUN segments might be moved, potentially conflicting with Storage DRS recommendations.
In Storage DRS in vSphere 6.0, we have made changes to array-based auto-tiering datastore interop so that datastores with auto-tiering can now be identified via VASA 2.0 and treated differently for performance modeling purposes.
4. Site Recovery Manager (SRM) and vSphere Replication (vR) Interoperability
These have been known limitations for some time, and addressing them has been a top ask from our customers. I wrote about SRM and SDRS interop issues on the vSphere Storage blog back in the day. I also wrote a post about vSphere Replication interop issues. Now, vSphere 6.0 fixes the interop issues and you can use Storage DRS and SRM/vR together.
Storage DRS recommendations are now also in line with replication policies. For instance, some datastores might have asynchronous replication and others may have synchronous replication. Previously, it was possible for a VM to be moved to a datastore with an incorrect replication type. Storage DRS will now make sure that a VM is placed/balanced on a datastore with the same policy. If there is a recommendation to move a VM to another datastore that does not have the same policy (e.g. maintenance mode), Storage DRS will alert that moving the VM to that datastore may result in (temporary) loss of replication.
In the case of vSphere Replication, replica disks are instantiated on the secondary site. Storage DRS now understands the space usage of replica disks, and Storage DRS can be used for the replica disks on the secondary site. Previously, Storage DRS did not recognize these files. These can now be balanced in the same way as standard VM files.
5. Fix to limit the number of concurrent Storage vMotions
Storage DRS has some hard limits for concurrent Storage vMotions:
- Maximum number of Storage vMotions for I/O load balancing (default:3, max:10)
- Maximum number of moves per host (default:8, max:unlimited)
These limits did not appear to be adhered to, and we received many reports of large numbers of Storage vMotion operations running concurrently when Storage DRS was enabled. This had a negative effect on the overall performance of the datastore in question. In vSphere 6.0, both of these settings are now honored, and customers have full control over the number of concurrent Storage vMotions that are initiated.
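As a rough illustration of what honoring both limits means, here is a small Python sketch. It is not how Storage DRS schedules moves internally. The `admit_moves` function and its parameter names are hypothetical, the defaults are taken from the two limits listed above, and the way the two limits are combined here (an overall cap on I/O load balancing moves plus a per-host cap) is an assumption made for the example.

```python
from collections import Counter
from typing import List, Tuple

Move = Tuple[str, str]  # (vm_name, host_name), hypothetical representation


def admit_moves(pending: List[Move],
                max_io_moves: int = 3,
                max_per_host: int = 8) -> List[Move]:
    """Pick moves from the pending list, in order, without exceeding either limit."""
    admitted: List[Move] = []
    per_host: Counter = Counter()
    for vm, host in pending:
        if len(admitted) >= max_io_moves:
            break  # overall cap reached, the rest wait for a later pass
        if per_host[host] >= max_per_host:
            continue  # this host is saturated, try moves on other hosts
        admitted.append((vm, host))
        per_host[host] += 1
    return admitted


if __name__ == "__main__":
    pending = [("vm1", "esx01"), ("vm2", "esx01"), ("vm3", "esx02"),
               ("vm4", "esx02"), ("vm5", "esx03")]
    print(admit_moves(pending))
    print(admit_moves(pending, max_io_moves=10, max_per_host=1))
```

With the defaults, only the first three moves are admitted. With a per-host limit of 1, one move per host is admitted and the others are skipped.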
6. Storage DRS & SIOC support for IOPS reservations
IOPS reservations are something new, introduced with the mClock scheduler in vSphere 5.5. Previously, we could only set limits and shares on a VMDK. With the new mClock scheduler, we can also set an IOPS reservation. SIOC and Storage DRS both honor IOPS reservations. The I/O injector mechanism, previously used to automatically determine the latency threshold of a given datastore, has been enhanced to also determine the IOPS (4K, random read) of a given datastore. Storage DRS uses this information to determine the best datastore for initial placement to satisfy a VM’s IOPS reservations. It also uses it to do on-going load-balancing if there is an IOPS reservation violation. A nice new feature.
You may have noticed that many of these enhancements to Storage DRS require VASA 2.0. The storage vendor will need to provide an updated VASA provider and the administrator will need to register a VASA Provider for the datastore before SDRS will perform the checks outlined above.
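The initial placement idea can be sketched as follows. This is a simplified Python illustration, not the actual Storage DRS algorithm. `DatastoreIops` and `pick_datastore` are hypothetical names, `measured_iops` stands in for the figure the I/O injector determines, and choosing the datastore with the most unreserved IOPS is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DatastoreIops:
    name: str
    measured_iops: int  # estimated capability (4K, random read)
    reserved_iops: int  # sum of IOPS reservations already placed here

    @property
    def unreserved_iops(self) -> int:
        return self.measured_iops - self.reserved_iops


def pick_datastore(candidates: List[DatastoreIops],
                   vm_reservation: int) -> Optional[DatastoreIops]:
    """Return a datastore that can satisfy the reservation, or None if none can."""
    fitting = [d for d in candidates if d.unreserved_iops >= vm_reservation]
    if not fitting:
        return None
    # Assumption: prefer the datastore with the most unreserved IOPS.
    return max(fitting, key=lambda d: d.unreserved_iops)


if __name__ == "__main__":
    stores = [
        DatastoreIops("ds1", measured_iops=5000, reserved_iops=4500),
        DatastoreIops("ds2", measured_iops=8000, reserved_iops=3000),
    ]
    chosen = pick_datastore(stores, vm_reservation=1000)
    print(chosen.name if chosen else "no datastore can satisfy the reservation")
```

A VM with a 1000 IOPS reservation does not fit on ds1 (only 500 unreserved), so the sketch places it on ds2.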
Hi Cormag. Great blog post. Thanks for keeping us informed about such details. Do you plan to write post about new disk scheduler mClock and IOPS reservations in detail? I don’t understand how I/O injector determine IOPS performance of particular datastores sharing same disk pool on shared storage because performance is shared. Can I/O injector recognize shared disk pools? Thanks.
If I get the cycles, I might David. Right now I’m focusing primarily on VSAN and some VVols, but later this year, time permitting, I may return to some of this core storage stuff. If I do, I’ll certainly look closer at this mechanism as I don’t believe it is documented in any great detail.
Does this mean that SIOC no longer interferes with SRM planned migrations?
and the problems in this article
http://blogs.vmware.com/vsphere/2012/06/sioc-and-srm.html
are a thing of the past?
Good question – let me reach out to Ken and get back to you.
That would be great, did you hear anything yet?
Hi Anthony,
So I managed to confirm that SIOC no longer interferes with SRM in 6.0. To verify,
1. Create a VM without on a replicated datastore.
2. Enable Storage I/O control on above datastore.
3. Create a protection group and recovery plan, run a failover from Site A to Site B, then reprotect from Site B to Site A.
4. Configure Storage I/O Control on the recovered datastore, run a failover (failback) from Site B to Site A, and then reprotect from Site A to Site B.
You should be able to do all these workflows without SIOC interference.