We at VMware have been making considerable changes to the way that the All Paths Down (APD) and Permanent Device Loss (PDL) conditions are handled. In vSphere 5.1, we introduced a number of enhancements around APD, including timeouts for devices that entered the APD state. I wrote about the vSphere 5.1 APD improvements here. In vSphere 5.5, we introduced yet another improvement to this mechanism: the automatic removal of devices that have entered the PDL state from the ESXi host.
In a nutshell, PDL AutoRemove automatically removes a device in a PDL state from the ESXi host. Think about it: here we have a device that we know is never coming back, based on the SCSI sense code returned by the controller/array. Why wouldn’t we want to clean up this device? A device in a PDL state cannot accept any more I/O, but it still needlessly counts against the limit of 256 devices per host on ESXi. With PDL AutoRemove, we now have the added benefit of the device being automatically removed from the ESXi host (since it is never coming back).
PDL AutoRemove occurs only if there are no open handles left on the device. If handles are still open when the device enters the PDL state, the removal is deferred and happens when the last handle on the device closes.
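To make the open-handle rule concrete, here is a small toy model in Python. It is only a sketch of the behaviour described above, not how ESXi implements it; the `Device` and `Host` classes, the `auto_remove` flag and the `example-lun` name are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Device:
    """Hypothetical stand-in for a storage device seen by a host."""
    name: str
    open_handles: int = 0   # consumers that still hold the device open
    in_pdl: bool = False    # True once the device is known to be permanently lost


class Host:
    """Toy host that applies the AutoRemove rule described in the text."""

    def __init__(self, auto_remove: bool = True) -> None:
        self.auto_remove = auto_remove
        self.devices: Dict[str, Device] = {}

    def add(self, device: Device) -> None:
        self.devices[device.name] = device

    def mark_pdl(self, name: str) -> None:
        # The device has entered the PDL state; remove it now if nothing holds it open.
        self.devices[name].in_pdl = True
        self._maybe_auto_remove(name)

    def close_handle(self, name: str) -> None:
        device = self.devices[name]
        if device.open_handles > 0:
            device.open_handles -= 1
        # Closing the last handle is what triggers a deferred removal.
        self._maybe_auto_remove(name)

    def _maybe_auto_remove(self, name: str) -> None:
        device = self.devices[name]
        if self.auto_remove and device.in_pdl and device.open_handles == 0:
            del self.devices[name]


host = Host()
host.add(Device("example-lun", open_handles=2))

host.mark_pdl("example-lun")
print("example-lun" in host.devices)   # True: two handles are still open

host.close_handle("example-lun")
print("example-lun" in host.devices)   # True: one handle is still open

host.close_handle("example-lun")
print("example-lun" in host.devices)   # False: last handle closed, device removed
```

With `auto_remove=False`, the same sequence leaves the device in place, which is the behaviour you would want to model for a host where the feature has been disabled.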
One important point, however: due to the nature of stretched/metro clusters, it is recommended that this setting be disabled in those environments. If you wish to read more about the reasons why, and how to do it, Michael Webster does a good job of explaining it in this blog article.
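For reference, here is a minimal Python sketch that builds the command line for toggling the feature. It assumes the feature is controlled by the host advanced setting `Disk.AutoremoveOnPDL` (option path `/Disk/AutoremoveOnPDL`) and set through `esxcli system settings advanced set`; neither is spelled out in this post, so please verify both against the VMware documentation for your release before using them. The helper name `build_autoremove_command` is hypothetical, and the snippet only prints the command; it does not run anything on a host.

```python
from typing import List

# Assumption: this option path is not stated in the post; check it for your ESXi release.
PDL_AUTOREMOVE_OPTION = "/Disk/AutoremoveOnPDL"


def build_autoremove_command(enabled: bool) -> List[str]:
    """Return the esxcli argument list that enables (1) or disables (0) PDL AutoRemove."""
    value = "1" if enabled else "0"
    return [
        "esxcli", "system", "settings", "advanced", "set",
        "-o", PDL_AUTOREMOVE_OPTION,
        "-i", value,
    ]


# Command to disable the feature, as recommended for stretched/metro clusters.
print(" ".join(build_autoremove_command(enabled=False)))
```

Building the argument list separately from executing it makes it easy to review the exact command for each host before it is run.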
Get notification of these blog postings and more VMware Storage information by following me on Twitter: @CormacJHogan