CormacHogan.com

iSCSI on vSAN Stretched Cluster

vSAN readers will most likely be aware that we introduced support for iSCSI on vSAN way back in vSAN 6.5. That is to say, we had the ability to create iSCSI targets and LUNs using vSAN objects, and present the LUNs to external iSCSI initiators. That release also supported Persistent Group Reservations (PGRs), but it lacked transparent failover. We followed this up with an enhancement in vSAN 6.7 which enabled transparent failover. This in turn enabled support for Windows Server Failover Cluster (WSFC) on iSCSI on vSAN when using shared disk mode, which relies on reservations on disk. WSFC using node majority or file share quorum was already supported on vSAN, by the way. However, there is still a caveat surrounding vSAN iSCSI support, and that is when it comes to stretched clusters. I continue to get questions about this configuration. At present (vSAN 6.7), we do not support iSCSI on vSAN stretched clusters, and I will describe the reason why shortly.

Let’s first describe a little about the iSCSI on vSAN architecture. With the iSCSI implementation on vSAN, there is the concept of a target I/O owner for vSAN iSCSI. The I/O owner is what the iSCSI initiator connects to. However, the I/O owner may be on a completely different vSAN node/host to the actual iSCSI LUN backed by a vSAN VMDK object. This is not a problem for vSAN deployments, as this can be considered akin to a VM’s compute residing on one vSAN host and the VM’s storage residing on a completely different vSAN host. This ‘non locality’ feature of vSAN allows us to do operations like maintenance mode, vMotion, capacity balancing and so on without impacting the performance of the VM. The same is true for our iSCSI implementation: we should be able to move the I/O owner to a different host, and even migrate the iSCSI LUNs to different hosts, without impacting our iSCSI performance. This enables our iSCSI implementation to be unaffected by operations such as maintenance mode, balancing tasks, and of course any failures in the cluster.

Now, if we think about iSCSI LUNs on a vSAN stretched cluster, a scenario could arise where the I/O owner is residing on one site in the stretched cluster, whilst the actual vSAN object backing the iSCSI LUN is on the other site. In that case, all the traffic between the iSCSI initiator and the iSCSI target would have to traverse the inter-site link. But remember that this is already true for writes, since write data is written to both sites anyway (RAID-1). And when it comes to read workloads, we do have the ability to read data from the local site for both iSCSI and VM workloads, and not traverse the inter-site link. This means that it doesn’t really matter which site the I/O owner resides on. So what is the issue then?
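Before answering that, here is a minimal sketch of the back-end reasoning above. It is a toy model and not vSAN code: the site labels and the function name `backend_crossings` are my own assumptions for illustration, and it only counts whether the data for a single I/O has to cross the inter-site link (acknowledgements are ignored).

```python
# Toy model (not vSAN code): does a single I/O issued by the target I/O owner
# have to cross the inter-site link on the vSAN back end?
# Assumption: the object is mirrored (RAID-1) with one replica on each site.
SITES = ("A", "B")


def backend_crossings(owner_site: str, op: str) -> int:
    """Return the number of inter-site data transfers for one I/O."""
    if owner_site not in SITES:
        raise ValueError(f"unknown site: {owner_site}")
    if op == "write":
        # The write lands on both replicas, so one copy is always remote,
        # whichever site the I/O owner happens to be on.
        return 1
    if op == "read":
        # The read is served from the replica on the I/O owner's own site.
        return 0
    raise ValueError(f"unknown operation: {op}")


for site in SITES:
    print(site, "read:", backend_crossings(site, "read"),
          "write:", backend_crossings(site, "write"))
```

The numbers come out the same for either site, which is just another way of saying that the placement of the I/O owner does not change the back-end cost.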

The key issue is the location of the iSCSI initiator. If the initiator is somewhere on site A, and the target I/O owner is on site B, then the iSCSI traffic (as well as any vSAN traffic) will need to traverse the inter-site link. In a nutshell, we could end up adding an extra inter-site trip for iSCSI traffic, and this is what needs to be addressed before we can offer full support for iSCSI on vSAN Stretched Cluster. We need to be able to offer some sort of locality between the iSCSI initiator and the target I/O owner.
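To make the extra trip concrete, here is a small sketch that enumerates the possible placements of initiator and I/O owner across two sites. Again, this is only an illustrative model under my own assumptions (the names `frontend_crossings` and `placements` are hypothetical); it says nothing about how vSAN actually chooses an I/O owner.

```python
from itertools import product

# Toy model (not vSAN code): the iSCSI "front end" hop is the path between
# the initiator and the target I/O owner. It crosses the inter-site link
# only when the two sit on different sites.
SITES = ("A", "B")


def frontend_crossings(initiator_site: str, owner_site: str) -> int:
    """Return 1 if initiator-to-owner traffic crosses the inter-site link."""
    return 0 if initiator_site == owner_site else 1


def placements() -> dict:
    """Map every (initiator site, I/O owner site) pair to its front-end cost."""
    return {
        (initiator, owner): frontend_crossings(initiator, owner)
        for initiator, owner in product(SITES, repeat=2)
    }


for (initiator, owner), extra in placements().items():
    note = "extra inter-site trip" if extra else "local"
    print(f"initiator on {initiator}, I/O owner on {owner}: {note}")
```

Half of the placements incur the extra trip in this model, and that trip comes on top of whatever vSAN traffic the I/O generates. Locality between the initiator and the I/O owner would mean only ever ending up in the "local" rows.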

We continue to scope out this requirement internally here at VMware. If you do have a pressing need to support iSCSI on vSAN stretched cluster, please let me know and I can forward your request.
