Auto LUN Discovery on ESXi hosts
Did you know that any newly presented LUNs/paths added to an already discovered target will automatically be discovered by your ESXi host without a rescan of the SAN? In this example, I currently see two iSCSI LUNs from my NetApp array:
Let’s see what happens when I add new devices to my ESXi host from a new target.
My next step is to add a new LUN, but this time from my EMC array (which is a brand new target). Using the CLI on my EMC array, I can add a new LUN (ID 10) as follows to my ESXi host:
Now if I initiate a rescan from my vSphere Client, this new LUN from my EMC array appears. I have to do this manual rescan in order to discover the target for the first time (in this context, target represents a controller port on the EMC storage array).
The EMC array shows up as target 2 (T2) in the RunTime name above. Now I go ahead and present a second LUN (ID 20) from my EMC array to my ESXi host.
Since the target has already been discovered, there is no need to manually rescan. The new LUN shows up automatically after a few moments on my ESXi host and in my vSphere Client:
I guess the next question is what triggers the rescan. There are in fact two cases which will discover new or removed LUNs or device paths automatically. The first of these is the periodic probe, which runs every 5 minutes (tunable) and works with existing targets to discover new paths. The other is an adapter event, which is triggered when a new target is found or an old target is removed. We use the SCSI Report LUNs command where possible but will also use SCSI Inquiries. The advanced setting that controls the probe interval is Disk.DeviceReclaimTime.
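To make the walkthrough above concrete, here is a toy Python model of the behaviour. It is only a sketch and not how ESXi is implemented: the Host class, its method names and the array_state dictionary are all made up for illustration, and adapter events are left out entirely. It simply captures the idea that the periodic probe only looks at targets the host already knows about, while a manual rescan picks up new targets as well.

```python
# Toy model of the discovery behaviour described in this post.
# Hypothetical names throughout; this is not ESXi code.

class Host:
    def __init__(self):
        self.known_targets = set()
        self.visible_luns = set()  # set of (target, lun_id) pairs

    def periodic_probe(self, array_state):
        """Refresh LUNs, but only on targets that were already discovered."""
        self.visible_luns = {
            (target, lun)
            for target in self.known_targets
            for lun in array_state.get(target, ())
        }

    def rescan(self, array_state):
        """Manual rescan: learn about every target, then refresh the LUNs."""
        self.known_targets = set(array_state)
        self.periodic_probe(array_state)


# Replay the steps from the post.
array_state = {"netapp": {0, 1}}
host = Host()
host.rescan(array_state)             # two NetApp LUNs visible

array_state["emc"] = {10}            # LUN 10 presented from a brand new target
host.periodic_probe(array_state)     # probe alone does not find it
seen_before_rescan = ("emc", 10) in host.visible_luns

host.rescan(array_state)             # manual rescan discovers the EMC target
array_state["emc"].add(20)           # second LUN on the now-known target
host.periodic_probe(array_state)     # found without a manual rescan
seen_after_probe = ("emc", 20) in host.visible_luns
```

In this model, seen_before_rescan ends up False and seen_after_probe ends up True, which matches what I saw with LUN 10 and LUN 20 above.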
Get notification of these blog postings and more VMware Storage information by following me on Twitter: @CormacJHogan
Hello, great post!
Is this behavior new for ESXi 5.0 and above? We see a lot of errors on our VMAX since upgrading to ESXi 5.0 U1, and the VMAX guys complain that the ESX servers are always trying to touch the snap vdevs. Is Disk.DeviceReclaimTime responsible only for the automatic LUN discovery feature, and if so, can it be turned off or safely increased?
Are you seeing these at 5 minute intervals? If so, slightly tweak the 300 second value (either higher or lower) by 30 to 60 seconds. This will confirm whether the attempts to discover path changes/new devices are related. However, the ESXi host is simply issuing standard SCSI REPORT_LUN and/or INQUIRY commands. This should not cause errors, so something else must be askew on the array side, FYI. And no, this is not new to 5.x. It has been around for a long time.
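One rough way to check for the 5 minute pattern is to take the error timestamps from the array logs and see how many of the gaps between consecutive errors sit close to the probe interval. The Python sketch below assumes you have already extracted the timestamps as seconds; the function name and the 10 second tolerance are arbitrary choices for illustration, not anything taken from ESXi or the array tools.

```python
def fraction_near_interval(timestamps, interval=300.0, tolerance=10.0):
    """Return the fraction of consecutive gaps within `tolerance` seconds of `interval`."""
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if not gaps:
        return 0.0  # fewer than two timestamps, nothing to compare
    hits = sum(1 for gap in gaps if abs(gap - interval) <= tolerance)
    return hits / len(gaps)
```

A result close to 1.0 suggests the errors line up with the periodic probe. After changing the setting to, say, 330 seconds, calling the function again with interval=330.0 should show the pattern moving with it if the two are really related.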
Thank you for your answer, Cormac! The errors we have seen on the VMAX array are about ESXi trying to reach devices that are in a “Not Ready” state, e.g. RM snapshots and SRM test snaps. I’m guessing this is not the cause then, since we didn’t have those errors with ESXi 4.1, but I will give that setting a test drive to see if it improves anything. Thanks!
Cormac, may I know if the REPORT_LUN command is also used when dead paths come back up?
The reason for my question above is that
RSCN can also notify the ESX host about some events:
*Nodes joining or leaving the fabric (most common usage)
*Switches joining or leaving the fabric
*Changing the switch name
*Changes caused by a server restart or a new product being added to the SAN
Every time these events occur, will the ESX host send REPORT_LUN and/or INQUIRY commands?
Yes, my understanding is that these events will also generate a REPORT_LUN and/or INQUIRY command. Without these events occurring however, the host will automatically trigger a REPORT_LUN and/or INQUIRY command every 300 seconds.