SIOC and datastores spread across all spindles in the array
This is a query which has come up on numerous occasions in the past, especially in the comments section of a post on debunking SIOC myths on the vSphere Storage Blog. This post highlights some recommendations which should be implemented when you have a storage array that presents LUNs spread across all spindles, or indeed multiple LUNs all backed by the same set of spindles from a particular aggregate or storage pool.
VMware engineering is continuing to look at options available in such configurations, especially where a VM on LUN1 experiences high latency while VMs on other LUNs do not (for example, if the VM on LUN1 is doing a worst-case random workload while VMs on other LUNs are running sequential workloads). Basically, in these situations there is a noisy neighbour VM which, although it is not on the same datastore, can impact VMs on other datastores. How should/could SIOC handle this scenario?
Currently, the recommendations regarding SIOC deployment when datastores are spread across all spindles in the storage array are as follows:
- Enable SIOC on all the LUNs coming from the shared spindles, and
- Set the congestion threshold to the same value across all LUNs
This configuration will result in congestion being detected on ALL the LUNs, and hence in I/O being throttled back on all datastores.
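For illustration, here is a minimal sketch, assuming pyVmomi, of how these two recommendations might be applied programmatically. The vCenter address, credentials, datastore names and the 30 ms threshold are all placeholders rather than values from this post, and real code would add SSL and error handling.

```python
# Minimal sketch, assuming pyVmomi: enable SIOC and set one manual congestion
# threshold on every datastore backed by the shared spindles. The vCenter
# address, credentials, datastore names and the 30 ms value are placeholders.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local', pwd='password')
content = si.RetrieveContent()

# Datastores (LUNs) assumed to be carved from the same spindles/pool.
shared_spindle_datastores = {'LUN1', 'LUN2', 'LUN3'}

spec = vim.StorageResourceManager.IORMConfigSpec()
spec.enabled = True                       # turn SIOC on
spec.congestionThresholdMode = 'manual'   # vSphere 5.1+ also offers 'automatic'
spec.congestionThreshold = 30             # ms -- identical value on every LUN

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.Datastore], True)
for ds in view.view:
    if ds.name in shared_spindle_datastores:
        content.storageResourceManager.ConfigureDatastoreIORM_Task(ds, spec)

Disconnect(si)
```

The same settings can of course be made through the vSphere Client or PowerCLI; the point is simply that the enable flag and the threshold are per-datastore settings, so they need to be applied to every LUN in the shared pool.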
There are two caveats:
- There is no proportional share enforcement across LUNs. Hence, if the sum of all shares on LUN1 is 100 and on LUN2 is 1000, you will NOT get a 1:10 throttle ratio across the LUNs. However, the I/O throttling at each LUN will respect the shares (see the sketch after this list).
- There might be corner cases where the congestion threshold is set slightly differently if the automatic threshold setting mechanism (introduced in vSphere 5.1) is used. There is a good chance that the model parameters may be slightly different when measured from different LUNs. As a result, the LUN with the lower threshold setting might find that its performance is not isolated from load on some other LUN. Therefore the recommendation is to set the threshold to be the same, as per the advice above.
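To make the first caveat concrete, the hedged sketch below (same pyVmomi assumption; the `vm` object and the share value are hypothetical inputs) shows where per-virtual-disk shares are set. SIOC honours these shares among the VMs on the same datastore only; they do not become a throttle ratio across datastores.

```python
# Sketch only: assign custom SIOC shares to every virtual disk of a VM.
# `vm` is assumed to be a vim.VirtualMachine already looked up in vCenter,
# and 1000 is just an illustrative share value.
from pyVmomi import vim

def set_disk_shares(vm, share_value=1000):
    """Give each virtual disk on `vm` custom SIOC shares.

    The shares are honoured among VMs on the same datastore only; they do
    not translate into a throttle ratio across different LUNs/datastores.
    """
    changes = []
    for device in vm.config.hardware.device:
        if isinstance(device, vim.vm.device.VirtualDisk):
            alloc = device.storageIOAllocation or vim.StorageResourceManager.IOAllocationInfo()
            alloc.shares = vim.SharesInfo(level=vim.SharesInfo.Level.custom,
                                          shares=share_value)
            device.storageIOAllocation = alloc
            changes.append(vim.vm.device.VirtualDeviceSpec(
                operation=vim.vm.device.VirtualDeviceSpec.Operation.edit,
                device=device))
    return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=changes))
```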
I would also mention that if you have numerous vCenter servers, numerous clusters and numerous ESXi hosts all sharing access to the same array (albeit different LUNs) and you wish to use the SIOC feature, look at a storage management scheme whereby different pools/aggregates are created on a per-vCenter basis.
Another important point is that neither datastores using an unshared set of spindles nor datastores sharing the same set of spindles should be presented to ESXi hosts managed by different vCenter servers. As per KB 1020651, "Not all of the hosts accessing the datastore are managed by the same vCenter Server" is an unsupported configuration, and it may result in "An unmanaged I/O workload is detected on a SIOC datastore". If the same storage aggregate/pool on the array is shared by multiple vCenter servers (even though they all access different LUNs), you may run into the noisy neighbour VM impacting VMs on different datastores. With separate pools/aggregates, and by ensuring that all ESXi hosts accessing the datastores are managed by the same vCenter server, you can avoid this. To finish, if an unmanaged I/O workload is detected by SIOC, any I/O throttling will stop.
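As a starting point for that kind of audit, here is a hedged pyVmomi sketch which lists, for each SIOC-enabled datastore known to one vCenter, the hosts that vCenter sees mounting it. It can only report hosts managed by that vCenter; hosts attached from another vCenter (the scenario KB 1020651 warns about) have to be tracked down on the array or fabric side. Connection details are placeholders.

```python
# Sketch: report SIOC-enabled datastores, their congestion thresholds and the
# hosts mounting them, as seen by a single vCenter. Placeholder credentials,
# no SSL or error handling.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local', pwd='password')
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.Datastore], True)
for ds in view.view:
    iorm = ds.iormConfiguration
    if iorm and iorm.enabled:
        hosts = sorted(mount.key.name for mount in ds.host)
        print('%s: threshold=%sms hosts=%s' % (ds.name, iorm.congestionThreshold, hosts))

Disconnect(si)
```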
We understand that there is considerable storage planning around this, and as mentioned, we continue to look into this to see if there are ways to improve upon SIOC so that it can detect noisy neighbour conditions occurring when datastores are backed by a shared set of spindles. More info as I get it.
For arrays where different policies exist, effectively prioritizing larger portions of some LUNs on higher tiers of storage (i.e. FAST VP with a gold policy of 100/100/100 and a silver policy of 5/100/100), would the recommendation still be to place the same congestion threshold on those LUNs? My impression was that it may be good practice to set a higher threshold on the silver policy and a lower threshold on the gold policy. This is in fact what EMC best practices state.
Wasn’t sure if the recommendation above applied to any data residing on those shared pools, or if it depended on whether different tiers were involved with multiple tiering policies.
I will qualify my response by saying that this is something which would need further testing. However, if you set a low threshold for gold, the I/O will be throttled more aggressively on the gold VMs than on the silver VMs, which is unfair on the gold VMs (considering both gold and silver are backed by the same set of devices).
General consensus is that both gold and silver should have the same threshold. Do you have a reference to EMC documentation which recommends differently? Thanks.
Hi Cormac – Extremely delayed response 🙂 It used to be referenced in H2529 on page 219, but it appears newer versions of this document (link below) have since been updated to recommend not modifying the latency threshold, which would align with what you discussed in keeping them consistent.
http://www.emc.com/collateral/hardware/solution-overview/h2529-vmware-esx-svr-w-symmetrix-wp-ldv.pdf
Cheers
Hi Cormac, is there a way to limit SIOC throttling when the latency gets too high on the array? We hit a situation on vSphere 5.0 U1 where SIOC pushed the queue depth (QD) down to 2 on several ESXi hosts because of bad SAN array latency (>100ms). The QD was so small that the VMs were almost frozen, so we ended up with even worse behaviour than without SIOC. Thanks for your great post BTW.