Automating the IOPS setting in the Round Robin PSP

Cormac

11 years ago

A number of you have reached out about how to change some of the settings around path policies, in particular how to set the default number of IOPS in the Round Robin path selection policy (PSP) to 1. While many of you have written scripts to do this, the PSP defaults are re-applied when the ESXi host reboots, and you then have to run the scripts again to reapply the changes. Here I will show you how to modify the defaults so that when you unclaim/reclaim the devices, or indeed reboot the host, the desired settings come into effect immediately.

Before we begin, let’s just revisit this whole IOPS=1 setting in the Round Robin path selection policy. My pal Duncan has posted his concerns about this before. Just to recap, Round Robin PSP works best at scale. If you do some testing with a single VM on a single datastore, then you are going to see improvements with the IOPS=1 setting when compared to the default setting of 1000. However, once you start to scale (multiple VMs deployed across multiple datastores), then the default value provided by VMware should provide just as good performance. Now I’m not going to question how our storage partners are arriving at the recommendation for IOPS=1, but I hope they are testing the path selection policy at scale before arriving at the recommendation. Let’s move on.

Each storage array supported on the VMware Hardware Compatibility List (HCL) will have a Storage Array Type Plugin (SATP) associated with it. The SATP decides which conditions (i.e. which SCSI sense codes) will fail an I/O over to an alternate path. With each SATP, there is a default Path Selection Policy (PSP) which determines which path to use for I/O. One of these PSPs is Round Robin, a path selection policy which balances I/O across all active paths. When it switches to the next path is based on certain criteria. By default, the PSP will send 1,000 I/Os down the first path, then 1,000 I/Os down the next and so on in a round robin fashion. Many of our storage partners are recommending that this should be set to a single I/O before moving to the next path. HP recommend it here on page 29 and EMC recommend it here on page 18.
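To make the switching behaviour concrete, here is a small illustrative model in shell. It is only arithmetic for reasoning about the setting, not how the NMP is actually implemented, and the function name rr_path_for_io is made up for this sketch.

```shell
# Illustrative model only (hypothetical helper, not part of ESXi):
# which path index serves I/O number N when the policy moves to the
# next path after every IOPS_LIMIT I/Os.
rr_path_for_io() {
    io_number=$1    # 0-based sequence number of the I/O
    iops_limit=$2   # I/Os sent down one path before switching
    path_count=$3   # number of active paths
    echo $(( (io_number / iops_limit) % path_count ))
}

rr_path_for_io 999 1000 2    # prints 0: still on the first path
rr_path_for_io 1000 1000 2   # prints 1: the default limit has been reached
rr_path_for_io 3 1 2         # prints 1: with iops=1 every I/O alternates
```

With the default of 1,000, a short test on one datastore may barely leave the first path, which fits the single-VM observations described earlier.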

The Round Robin PSP is associated with a modified IOPS setting via this esxcli command:

# esxcli storage nmp satp rule add -s "TestSATP" -V "TestVendor" -M "TestModel" -P "VMW_PSP_RR" -O "iops=1"

In order to identify the “Vendor” and “Model” variables, you will need to do the following:

1. Perform a rescan on an ESX/ESXi host
2. Run “grep -i scsiscan /var/log/vmkernel.log”
3. In the output, you will see the Vendor and Model for the device. Some examples are below:

0:00:00:29.585 cpu2:4114)ScsiScan: 1059: Path 'vmhba36:C0:T0:L0': Vendor: 'Dell' Model: 'MD32xxi' Rev: '7320'

2013-06-19T23:55:11.838Z cpu18:8728)ScsiScan: 888: Path 'vmhba1:C0:T0:L0': Vendor: 'DGC ' Model: 'RAID 5 ' Rev: '0430'
2013-06-19T23:55:11.838Z cpu18:8728)ScsiScan: 891: Path ‘vmhba1:C0:T0:L0’: Type: 0x0, ANSI rev: 4, TPGS: 3 (implicit and explicit)

You can see from the above examples that we have a Vendor “Dell” and a Model “MD32xxi” in the top line and Vendor “DGC” and Model “RAID 5” in the second.
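If you have a lot of log lines to go through, the Vendor and Model fields can be pulled out with sed. This is a minimal sketch that assumes the log uses plain ASCII single quotes around the values, as in the ScsiScan lines a host writes; the function name scsiscan_vendor_model is made up for illustration.

```shell
# Print "Vendor|Model" for each ScsiScan line read from stdin.
# Everything between the quotes is kept, including trailing spaces,
# since those matter when the values are used in a rule.
scsiscan_vendor_model() {
    sed -n "s/.*Vendor: '\([^']*\)' Model: '\([^']*\)'.*/\1|\2/p"
}

# Sample line copied from the examples above.
printf '%s\n' "2013-06-19T23:55:11.838Z cpu18:8728)ScsiScan: 888: Path 'vmhba1:C0:T0:L0': Vendor: 'DGC ' Model: 'RAID 5 ' Rev: '0430'" | scsiscan_vendor_model
```

On a host, the same function could be fed from the log with something like grep -i scsiscan /var/log/vmkernel.log | scsiscan_vendor_model | sort -u. The "|" separator is only there to make trailing spaces visible.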

If the above esxcli command is run with these Vendor and Model values, any devices discovered by the ESXi host from this array type will automatically have the Round Robin PSP associated with them, with the IOPS value set to 1. Note that if the Vendor ID contains trailing spaces, as in the EMC DGC example, the trailing spaces must be included. If there are any claim options associated with the device (as is the case with TPGS for ALUA arrays), these must also be included.
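Because quoting and trailing spaces are easy to get wrong, it can help to print the rule command first and only run it once it looks right. The helper below is a hypothetical dry-run sketch: print_rr_rule is a made-up name, "TestSATP" is the placeholder from the command above (substitute the SATP that claims your array), and tpgs_on is used as an example claim option for an ALUA array.

```shell
# Dry run: print the rule command instead of executing it.
# Arguments: SATP name, vendor, model, optional claim option.
print_rr_rule() {
    satp=$1
    vendor=$2
    model=$3
    claim_opt=${4:-}
    cmd="esxcli storage nmp satp rule add -s \"$satp\" -V \"$vendor\" -M \"$model\" -P \"VMW_PSP_RR\" -O \"iops=1\""
    # Append a claim option only when one was supplied.
    if [ -n "$claim_opt" ]; then
        cmd="$cmd --claim-option \"$claim_opt\""
    fi
    printf '%s\n' "$cmd"
}

print_rr_rule "TestSATP" "Dell" "MD32xxi"
# Vendor and model quoted with their trailing spaces, plus a claim option.
print_rr_rule "TestSATP" "DGC " "RAID 5 " "tpgs_on"
```

After adding a rule for real, esxcli storage nmp satp rule list should show it alongside the built-in rules.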

In order to get the claimrule to load, it is far easier to reboot the host; otherwise you have to unclaim each device that has already been claimed and reload the claimrules. However, if you are not in a position to reboot, these are the steps to unclaim the devices and reload the claimrules:

# esxcli storage core claiming unclaim -t device -d naa.xxxx

Repeat the above command for all LUNs presented. Once that is done, run this load command followed by a rescan:

# esxcli storage core claimrule load
# esxcfg-rescan vmhbaX
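With many LUNs, the repetition can be scripted. The sketch below is a dry run: it reads device IDs from stdin and prints the commands from the steps above rather than executing them. The function name print_reclaim_steps, the device IDs naa.aaaa and naa.bbbb, and the adapter vmhba1 are all placeholders for illustration.

```shell
# Dry run: print one unclaim command per device ID read from stdin,
# followed by the claimrule load and the rescan.
print_reclaim_steps() {
    adapter=$1   # HBA to rescan, e.g. vmhba1 (placeholder)
    while IFS= read -r device; do
        case $device in
            # Only lines that look like NAA device IDs are used.
            naa.*) printf 'esxcli storage core claiming unclaim -t device -d %s\n' "$device" ;;
        esac
    done
    printf 'esxcli storage core claimrule load\n'
    printf 'esxcfg-rescan %s\n' "$adapter"
}

printf 'naa.aaaa\nnaa.bbbb\n' | print_reclaim_steps vmhba1
```

Once the printed commands look correct, the output could be piped to sh on the host. How you produce the list of device IDs is up to you; one line per device is all the function assumes.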

One caveat: if you are using RDMs for Microsoft clustering, you will need to manually change those LUNs to Fixed or MRU, because they will also be claimed with Round Robin, which is not supported for them in the currently released versions of ESX/ESXi.
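For the clustered RDM LUNs, the policy can be changed per device with esxcli storage nmp device set. The sketch below again only prints the command; print_set_psp is a made-up helper name and naa.xxxx stands in for the real device ID of each RDM.

```shell
# Dry run: print the command that sets the PSP for a single device.
# The policy defaults to Fixed; pass VMW_PSP_MRU as the second argument for MRU.
print_set_psp() {
    device=$1
    psp=${2:-VMW_PSP_FIXED}
    printf 'esxcli storage nmp device set -d %s -P %s\n' "$device" "$psp"
}

print_set_psp naa.xxxx
print_set_psp naa.xxxx VMW_PSP_MRU
```

Which of the two policies is appropriate depends on the array, so check with your storage vendor before choosing.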

Hope you find this useful.