Heads Up! Device Queue Depth on QLogic HBAs

Just thought I’d bring to your attention something that has been doing the rounds here at VMware recently, and will be applicable to those of you using QLogic HBAs with ESXi 5.x. The following are the device queue depths you will find when using QLogic HBAs for SAN connectivity:

  • ESXi 4.1 U2 – 32
  • ESXi 5.0 GA – 64
  • ESXi 5.0 U1 – 64
  • ESXi 5.1 GA – 64

The higher depth of 64 has been in place since 24 Aug 2011 (the 5.0 GA release). The issue is that the change was never documented anywhere. For the majority of users this is not a problem, and is probably a benefit, but there are a few caveats.

For those of you with a significantly large number of LUNs presented to an ESXi host through a QLogic HBA, this is worth considering. Each device now consumes 64 slots rather than 32 in the adapter queue (which for QLogic is 4096). A queue depth of 32 allows 128 LUNs to drive their full queue depth simultaneously (128 x 32 = 4096); with the depth raised to 64, only 64 LUNs can do so (64 x 64 = 4096). If you hit the adapter queue limit, devices won’t be able to reach their full queue depth, and I/Os may be retried due to queue full conditions. You can use esxtop to monitor queue depth usage under the disk statistics.
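If you want to see how close you are getting to these limits, esxtop is the quickest way to look. A rough sketch of where to look (the naa.xxx device ID is just a placeholder, and the field names are from memory, so double-check them on your own build):

    # In esxtop, 'd' gives the disk adapter view (AQLEN = adapter queue depth)
    # and 'u' gives the disk device view (DQLEN = device queue depth).
    # The QUED and %USD columns show how much of each queue is actually in use.
    esxtop

    # Per-device settings, including the maximum queue depth, can also be listed with:
    esxcli storage core device list -d naa.xxx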

Why did we change it? The reason was Storage I/O Control (SIOC) improvements. The depth was increased to handle situations where there is a large number of VMs per datastore, giving SIOC a queue depth of 64 to play with rather than 32 for VM performance and fairness. If the VMs on one host are more active in terms of I/O and need a larger share of the I/O bandwidth to the same device/LUN, this gives SIOC the ability to do more for those VMs.

The bottom line is that this is a significant change between 4.x and 5.x (and something I wasn’t aware we had even changed). It is something to keep in mind if you are planning an upgrade to 5.x, or if you have already moved to 5.x and use QLogic HBAs for SAN connectivity with a lot of LUNs/devices. Emulex device queue depth settings have not changed between releases; the default is still 30, if I remember correctly.
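If you want to confirm the defaults your own hosts are using, the HBA driver module parameters are worth a look. A quick sketch, assuming the usual qla2xxx module for QLogic and lpfc820 for Emulex on ESXi 5.x (module and parameter names can vary with driver version, so treat these as examples rather than gospel):

    # QLogic: ql2xmaxqdepth is the per-device queue depth
    esxcli system module parameters list -m qla2xxx

    # Emulex: lpfc_lun_queue_depth is the equivalent setting
    esxcli system module parameters list -m lpfc820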

27 Replies to “Heads Up! Device Queue Depth on QLogic HBAs”

    1. Hey Cormac,

      This is absolutely spot on. I remember fighting hard to get that change checked in for ESX5 while I was in the DRS team. Indeed, this was necessary for better support for performance differentiation using Storage IO Control.

  1. Very timely, great article. One of our customers was questioning what it means to have fewer LUNs versus one large LUN, how an adapter manages multiple queues, and the effect on I/O performance. What queue depth settings should he use in either case? After many hours of reading driver code, I had to figure out the best value for the number of LUNs deployed on our box.

  2. Hi Cormac,

    Thank you for sharing these details. A quick query: what is the adapter queue limit of an Emulex HBA?

  3. Thanks for your response. Just to clarify, I was looking for the queue limit at the adapter level. You mentioned that it’s 4096 for QLogic; what is the equivalent value for Emulex? I tried looking on the net but couldn’t find anything concrete!

    1. Ah – I see, sorry. Well, QLogic used to expose this information in the proc nodes of the ESXi host. Emulex didn’t expose this info, so I’m not sure what value they use. I’ll see if I can find out.

  4. Today the numbers are different on QLogic cards. I can see that the max execution throttle value is 65536. Also, with flash storage the queue depth could be much larger.

  5. Hi Cormac,

    Do you know the default queue depth parameters inside the VM? I mean for the vHBA and the guest OS.
    Do the defaults for the LSI Logic and Paravirtual SCSI controllers also depend on the QLogic HBAs?

    Thanks !

    1. It varies from Guest OS to Guest OS and virtual adapter to virtual adapter. Usually you can find this in the registry of Windows machines. There should be some KB articles on kb.vmware.com which will tell you how to find it.

  6. Hey Cormac.

    Great insights. Thanks. I assume this is all for FC HBAs, correct?
    When using, say, an iSCSI HBA or a CNA in iSCSI mode, do any of the adapter-side queues differ for the various vendors that you know of?

    Thanks.
    Aboubacar

    1. Yep – FC HBAs. Sorry, haven’t looked at the other HBAs/CNAs – just not seeing enough of them I guess.

  7. Hi Cormac,
    Just found this, thanks for the info! You mention that with a queue depth of 64 the host HBA can max out at 64 LUNs. What impact would this have on the storage array front-end ports? Will this overrun the array port and cause queue full conditions?
    Thanks!

  8. Is Disk.SchedNumReqOutstanding still relevant in vSphere 5.5? I can’t find it in the Web Client or C# client. Is there a new way to set it?

    1. Yes, but now it is set on a per device basis.

      esxcli storage core device set --device naa.xxx --sched-num-req-outstanding <value>

      1. To check the current value for a device, run this command:

        esxcli storage core device list -d naa.xxx

        The value appears under “No of outstanding IOs with competing worlds:”

        http://kb.vmware.com/kb/1268 will be updated with this info. Subscribe to the document to get an update when that’s done.
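        For example, to cap a device at 64 outstanding I/Os and then verify the change (naa.xxx is a placeholder device ID, and this assumes the vSphere 5.5 esxcli namespace):

        esxcli storage core device set -d naa.xxx --sched-num-req-outstanding 64
        esxcli storage core device list -d naa.xxx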

        1. Hi Cormac. I am going through a large storage vendor migration, so I have a lot more LUNs presented than normal. With the new queue depth revelation of 64 for the HBA, I see a value of 32 when using esxcli storage core device list -d naa.xxx or when looking at the older Disk.SchedNumReqOutstanding.

          To prevent a possible queue full condition, can we adjust the adapter queue back down to 32, and if so, how?
