Storage I/O Control Enhancements in vSphere 5.0

Storage I/O Control (SIOC) was introduced in vSphere 4.1 and allows for cluster-wide control of disk resources. The primary aim is to prevent a single VM on a single ESX host from hogging all the I/O bandwidth to a shared datastore. An example would be a low-priority VM running a data-mining type application and degrading the performance of more important business VMs that share the same datastore.

Configuring Storage I/O Control

Let's have a brief overview of how to configure SIOC. SIOC is enabled very simply via the properties of the datastore. This is a datastore built on a LUN from an EMC VNX 5500:

[Screenshot: datastore properties with Storage I/O Control enabled]

The Advanced button allows you to modify the latency threshold. SIOC doesn't do anything until this threshold is exceeded. By default in vSphere 5.0 the latency threshold is 30ms, but you can change it if you want a lower or higher value:

[Screenshot: Advanced settings showing the latency threshold]

Through SIOC, Virtual Machines can now be assigned a priority when contention arises on a particular datastore. Priority of Virtual Machines is established using the concept of Shares. The more shares a VM has, the more bandwidth it gets to a datastore when contention arises. Although we had a disk shares mechanism in the past, it was only respected by VMs on the same ESX host, so it wasn't much use on shared storage accessed by multiple ESX hosts. Storage I/O Control enables the honoring of share values across all ESX hosts accessing the same datastore.
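
To make the idea of Shares concrete, here is a minimal Python sketch of proportional division. It is a simplified model only, not VMware's scheduler: the VM names, the 7000 IOPS figure for the datastore and the helper `split_by_shares` are all hypothetical, chosen purely for illustration.

```python
def split_by_shares(shares, total):
    """Divide `total` units of datastore capacity in proportion to shares.

    Simplified model: real SIOC throttles per-host device queues rather
    than handing out IOPS directly.
    """
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("at least one VM must hold a positive number of shares")
    return {vm: total * s / total_shares for vm, s in shares.items()}


# Hypothetical VMs on different ESX hosts, all on the same datastore.
vm_shares = {"db-vm": 2000, "web-vm": 1000, "mining-vm": 500}

# Assume (for illustration) the datastore can sustain 7000 IOPS under contention.
allocation = split_by_shares(vm_shares, total=7000)

for vm, iops in allocation.items():
    print(f"{vm}: {iops:.0f} IOPS")
```

With these made-up numbers the VM holding 2000 shares ends up with four times the allocation of the VM holding 500, regardless of which host each VM runs on.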

The shares mechanism is triggered when the latency to a particular datastore rises above the pre-defined latency threshold seen earlier. Note that the latency is calculated cluster-wide. Storage I/O Control also allows you to place a limit on the number of IOPS that a particular VM can generate to a shared datastore. The Shares and IOPS values are configured on a per-VM basis. Edit the Settings of the VM, select the Resources tab, and the Disk setting will allow you to set the Shares value for when contention arises (set to Normal/1000 by default) and to limit the IOPS that the VM can generate on the datastore (set to Unlimited by default):

[Screenshot: VM Disk resource settings showing Shares and Limit - IOPS]

More information on Storage I/O Control can be found in this whitepaper.
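
The interplay of the latency threshold, Shares and the IOPS limit described in this section can be sketched roughly as follows. This is an illustrative model under stated assumptions: the cluster-wide latency is taken here as a plain average of per-host observations (the text only says it is calculated cluster-wide), and the function names and sample numbers are hypothetical.

```python
from typing import Optional, Sequence

LATENCY_THRESHOLD_MS = 30.0  # vSphere 5.0 default, per the text above


def contention_detected(host_latencies_ms: Sequence[float],
                        threshold_ms: float = LATENCY_THRESHOLD_MS) -> bool:
    """Return True when the datastore-wide latency exceeds the threshold.

    Assumption: a simple mean across hosts stands in for the cluster-wide
    latency calculation.
    """
    if not host_latencies_ms:
        raise ValueError("need at least one latency observation")
    average = sum(host_latencies_ms) / len(host_latencies_ms)
    return average > threshold_ms


def effective_iops(demand: float, fair_share: float,
                   limit: Optional[float] = None,
                   contended: bool = False) -> float:
    """IOPS a VM is allowed in this model.

    Shares only matter under contention; a configured IOPS limit always applies.
    """
    allowed = demand
    if contended:
        allowed = min(allowed, fair_share)
    if limit is not None:  # None models the default of Unlimited
        allowed = min(allowed, limit)
    return allowed


quiet = contention_detected([12.0, 18.0, 25.0])   # below the 30ms threshold
busy = contention_detected([28.0, 35.0, 41.0])    # above the 30ms threshold

print(effective_iops(5000, fair_share=2000, contended=quiet))
print(effective_iops(5000, fair_share=2000, contended=busy))
print(effective_iops(5000, fair_share=2000, limit=1500, contended=busy))
```

The point of the sketch is the ordering: below the threshold a VM is held back only by its own limit, if it has one, while above the threshold its share-based allocation comes into play as well.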

SIOC Enhancements in vSphere 5.0

In vSphere 4.1, SIOC was supported for block storage devices (FC, iSCSI, FCoE) only. In vSphere 5.0, we have introduced support for NAS devices. This means that we now have a mechanism which will prevent a single VM/ESXi on an NFS datastore from hogging all the bandwidth to that datastore. Once again, you just select the properties of the NFS datastore to enable SIOC. Here is a screen-shot showing the SIOC properties for an NFS datastore presented to the ESXi hosts from a NetApp FAS3170A array:

[Screenshot: SIOC properties for an NFS datastore]

SIOC also has a new use case in vSphere 5.0, and that is of course Storage DRS. SIOC is used to initially gather information about datastore capabilities and is also used for gathering I/O Metrics from the datastores in an SDRS datastore cluster.

Common question

One question about Storage I/O Control which often arises is the following: if you have two hosts with equal shares, they will have equal queue lengths, so why do you observe different throughput in terms of Bytes/s or IOPS?

This is due to differences in per-IO cost and to scheduling decisions made within the array. The array may process requests in the order it considers most efficient for maximizing aggregate throughput, causing VMs with equal priority to display slightly different throughputs.
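
A quick back-of-the-envelope calculation shows why equal queue lengths do not imply equal throughput. The sketch below uses Little's Law (outstanding I/Os = IOPS x latency); the queue depth of 32, the I/O sizes and the per-IO latencies are made-up numbers for illustration, not measurements.

```python
def iops(queue_depth: int, latency_ms: float) -> float:
    """Little's Law rearranged: IOPS = outstanding I/Os / per-IO latency."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return queue_depth * 1000 / latency_ms


def mib_per_s(iops_value: float, io_size_kib: float) -> float:
    """Convert an IOPS figure to MiB/s for a given I/O size."""
    return iops_value * io_size_kib / 1024


QUEUE_DEPTH = 32  # both hosts get the same queue length (equal shares)

# Hypothetical workloads: small I/Os complete quickly, large I/Os cost more each.
host_a_iops = iops(QUEUE_DEPTH, latency_ms=4.0)    # 8 KiB I/Os
host_b_iops = iops(QUEUE_DEPTH, latency_ms=10.0)   # 64 KiB I/Os

print(f"host A: {host_a_iops:.0f} IOPS, {mib_per_s(host_a_iops, 8):.1f} MiB/s")
print(f"host B: {host_b_iops:.0f} IOPS, {mib_per_s(host_b_iops, 64):.1f} MiB/s")
```

In this example both hosts keep 32 I/Os in flight, yet host A comes out ahead on IOPS while host B comes out ahead on Bytes/s, simply because each I/O costs the array a different amount.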

External Workloads

SIOC can only work when it has insight into all the workloads on a datastore. If there are external workloads into which SIOC has no visibility, then an alarm 'Non-VI workload detected on the datastore' will be triggered. For SIOC to perform optimally, you will need to address the reason for this external workload and prevent it from recurring. This KB offers some very good advice on the subject: http://kb.vmware.com/kb/1020651.

Best Practice/Recommendation

Storage I/O Control is a really great feature for avoiding contention and thus poor performance on shared storage. It gives you a way of prioritizing which VMs are critical and which are not so critical from an I/O perspective. I highly recommend enabling it in your environments if you can. I believe that Storage I/O Control is essential to provide better performance for I/O intensive and latency-sensitive applications such as database workloads, Exchange servers, etc.