I/O Scheduler Queues Improvement for Virtual Machines
This is a new feature in vSphere 6.0 that I only recently became aware of. Prior to vSphere 6.0, all the I/Os from a given virtual machine to a particular device would share a single I/O queue. This would result in all the I/Os from the VM (boot VMDK, data VMDK, snapshot delta) being queued into a single per-VM, per-device queue. This caused I/Os from different VMDKs to interfere with each other and could actually hurt fairness.
For example, if a VMDK was used by a database, and this database issued a lot of I/O, those I/Os could compete with I/Os from the boot disk. This in turn could make it appear that the VM (Guest OS) was running slowly.
With this change in vSphere 6.0, each of the VMDK disks on a given device is given a separate scheduler queue. The storage stack now provides bandwidth controls to tune each of these queues for fair I/O sharing and traffic shaping.
This change is enabled by default in vSphere 6.0. It is controlled by a boot-time flag and can be disabled if required. If you need to turn it off, you can do so by adjusting the following parameter in the Advanced System Settings:
VMkernel.Boot.isPerFileSchedModelActive
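If you want to do this from the command line rather than the vSphere Web Client, the sketch below shows how it might look with esxcli on an ESXi 6.0 host. I am assuming here that the kernel option name simply drops the VMkernel.Boot. prefix, so double-check the syntax on your own build, and remember that since this is a boot-time flag, the host needs a reboot before a change takes effect.

# Check the current state of the per-file I/O scheduler
esxcli system settings kernel list -o isPerFileSchedModelActive

# Disable it, i.e. revert to the legacy per-VM, per-device queue model
esxcli system settings kernel set -s isPerFileSchedModelActive -v FALSE

# Re-enable it later if required
esxcli system settings kernel set -s isPerFileSchedModelActive -v TRUE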
This enhancement is currently only available for VMFS. It is not available for NFS at this time.
One area where we have observed significant performance improvement with this new scheduling mechanism is snapshots. A 4TB snapshot can now be completed in less than one third of the time previously taken, while continuing to be fair to the guest OS I/Os.
I was always told that to get the best performance when having 2 or 3 VMDKs on a single VM, each should be on its own SCSI adapter (i.e. SCSI address 0:0 for disk one, 1:0 for disk two and 2:0 for disk three). Does this no longer apply in version 6?
When I was playing around with the VCSA, I noticed 11 disks but only one SCSI controller, which surprised me because of what I mentioned above. Is this new feature in version 6 why the appliance only needs 1 controller for all those disks?
I wasn’t involved in the design of the VCSA 6.0, but this new feature will definitely improve IO fairness across all of those VMDKs.
Hi Cormac,
Coming back to Albert's question: in 5.x I used to keep the VMDKs on different SCSI adapters, like 0:0, 1:0, 2:0 (if I have three disks). So now with vSphere 6.x I don't have to do this and all the disks can be on the same SCSI controller? Please confirm.
Thanks
Good info. Do you know what I/O queue this affects (GQLEN, WQLEN or another) or does it affect the whole stack?
The IO scheduler changes only affect how we prioritize between the many IO streams/queues that exist for a LUN. In the past we would merge several IO streams into the same queue/stream, which means that you lose the ability to serve the originating streams in any special way.
The IO scheduler changes work such that, by default, you get a separate schedQ for every file opened on VMFS.
It does not affect any of the esxtop visible queue lengths/depths.
Actually, the multi-VMDK design is more about dynamic expansion of the Linux partitions and how it links with LVM:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2126276
Hi Cormac, your comment at the bottom of your post regarding snapshot commitment: does that mean that snapshot I/O (i.e. "system" I/O, I/O which is not directly initiated by the guest) is classed as a separate queue?
Yes, system-generated I/Os go to a separate kernel schedQ (e.g. scan I/Os).
Hi Cormac,
I'm sorry to bother you, but I don't really get how this new feature changes performance or priorities in vSphere 6.0.
I tried to think about 2 scenarios:
1) If I have shares configured on my VM's disks, I already have my priorities managed in the queue of the LUN, either at the ESXi or cluster level depending on whether SIOC is enabled. So this new feature would not change anything, would it?
2) If I have no shares configured, all my VM's disks will be treated in the same way, so... Or maybe it is only in this case (when no shares are configured) that this new feature takes care of fairness between disks, in the same way that DSNRO introduces fairness between VMs?
I'm not sure I really understand at which level this feature brings its improvements.