Can Storage DRS now detect dedupe on the array?

The answer right now is no, but if you are interested in how this query came about, and why I decided to blog about it, continue reading. It has something for those of you interested in some of the underlying workings of Storage DRS.

The initial query came from my pal Itzik over at EMC. In the vSphere 5.1 U1 release notes there is a fix called out for Storage DRS as follows:

Attempts to clone a virtual machine from a template fails when the destination datastore is a Storage DRS POD.

In vCenter Server, cloning a virtual machine from a template might fail when the destination datastore is a Storage DRS POD and the storage device has the de-duplication feature turned on. The following error is displayed:

Insufficient disk space on datastore xxxxx, this error message applies to all datastores in the POD.

This issue is resolved in this release.

One could deduce from this that Storage DRS can now understand de-duplication. That is not correct. The issue was with the way Storage DRS calculated the used space on a datastore: it summed the space consumed by each file backing a VMDK on each datastore in the datastore cluster. However, there were occasions where the datastore usage calculated by Storage DRS differed substantially from the datastore usage reported by the host. One reason identified for this is that de-duplication (which consolidates shared blocks) was enabled on the array. There is no way for this consolidation to be reported back, so the sum of the VMDK sizes overstated the datastore usage.
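To make the discrepancy concrete, here is a small sketch with made-up numbers. The figures, the variable names, and the helper `sum_of_vmdk_used_space` are all hypothetical and are not part of any VMware API; the sketch only illustrates how a per-file sum can drift away from what a de-duplicating array reports.

```python
GB = 1024 ** 3

def sum_of_vmdk_used_space(vmdk_sizes):
    """Add up the space consumed by each file backing a VMDK (the per-file view)."""
    return sum(vmdk_sizes)

# Assumption: ten VMs cloned from the same template, 40 GB consumed per VMDK.
vmdk_sizes = [40 * GB] * 10

# Assumption: the array has consolidated the shared blocks, so the datastore
# itself reports far less used space than the per-file sum suggests.
datastore_reported_used = 60 * GB

vmdk_sum = sum_of_vmdk_used_space(vmdk_sizes)
gap = vmdk_sum - datastore_reported_used

print(f"sum of VMDK used space : {vmdk_sum // GB} GB")
print(f"datastore used space   : {datastore_reported_used // GB} GB")
print(f"difference             : {gap // GB} GB")
```

With these invented figures the two views differ by several hundred GB, which is the kind of gap described above.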

This was not handled well in Storage DRS, as it used the maximum of “datastore used space” and “sum of VMDK used space” as its estimate of the space currently used on a datastore. As a result, new VM create requests via Storage DRS began failing even though there was more than enough unused space on the datastore.

The fix referred to in the vSphere 5.1 U1 release notes changes the estimation of available space on a datastore to rely on one of the two values in isolation: either “datastore used space”, as reported by the datastore, or “sum of VMDK used space”. The default is now to use “datastore used space”.
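The difference between the old and new estimation can be sketched in a few lines. This is only an illustration of the logic as described in this post, not Storage DRS code: the function names, the `UsedSpaceSource` enum, and the numbers are all hypothetical, and real Storage DRS placement involves more than this single check.

```python
from enum import Enum

GB = 1024 ** 3

class UsedSpaceSource(Enum):
    """Hypothetical names for the two values discussed in this post."""
    DATASTORE = "datastore used space"
    VMDK_SUM = "sum of VMDK used space"

def used_space_before_fix(datastore_used, vmdk_sum):
    # Old behaviour as described: take the maximum of the two values.
    return max(datastore_used, vmdk_sum)

def used_space_after_fix(datastore_used, vmdk_sum,
                         source=UsedSpaceSource.DATASTORE):
    # New behaviour as described: use exactly one value, in isolation.
    # The default is the value reported by the datastore.
    if source is UsedSpaceSource.DATASTORE:
        return datastore_used
    return vmdk_sum

def request_fits(capacity, used, requested):
    """Simplified space check: does the request fit in the remaining space?"""
    return capacity - used >= requested

# Assumed example: a 500 GB datastore on a de-duplicating array.
capacity = 500 * GB
datastore_used = 60 * GB   # as reported by the datastore
vmdk_sum = 480 * GB        # per-file sum, unaware of de-duplication
requested = 40 * GB        # new clone from a template

old_ok = request_fits(capacity, used_space_before_fix(datastore_used, vmdk_sum), requested)
new_ok = request_fits(capacity, used_space_after_fix(datastore_used, vmdk_sum), requested)

print("before fix:", "fits" if old_ok else "insufficient disk space")
print("after fix :", "fits" if new_ok else "insufficient disk space")
```

In this made-up scenario the old estimate leaves only 20 GB free and rejects the 40 GB clone, while the new default sees 440 GB free and accepts it.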

Consider upgrading to 5.1 U1 if you use Storage DRS and your storage arrays use array-based deduplication technologies.