There was a very interesting discussion on our internal forums here at VMware over the past week. One of our guys had built out a VSAN cluster, and everything looked good. However, when he attempted to deploy a virtual machine on the VSAN datastore, he kept hitting an error which reported that it “cannot complete file creation operation”. As I said, everything looked healthy: the cluster had formed correctly, there were no network partitions, and the network status was normal. So what could be the problem?
This is the error that popped up when the VM was being provisioned on the VSAN datastore:
Note also the “Failed to connect to component host” message. This might give you a clue as to the root cause. It had many of us scratching our heads until one of our engineers asked a question about MTU settings on the VSAN network. The MTU (maximum transmission unit) is the largest packet or frame size that can be sent over the network. In this case, an MTU of 9000 (jumbo frames) was configured on the switch. However, in this setup it seems that an MTU of 9000 on the switch (a Dell PowerConnect) wasn’t large enough to carry the frames produced by the MTU of 9000 in the ESXi configuration. The switch actually required an MTU of 9216 (9 * 1024) to allow successful communication using jumbo frames on the VSAN network. Once this change was made on the switch, virtual machines could be successfully provisioned on the VSAN datastore.
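To make the arithmetic behind this kind of mismatch concrete, here is a minimal sketch in Python. It assumes standard IPv4, ICMP and Ethernet header sizes, and it assumes (this is not confirmed for the switch in question) that the switch counts the Ethernet header, VLAN tag and FCS against its own MTU setting, which is one common reason a switch needs a value above 9000. The function names and the target address are hypothetical; `vmkping` with `-d` (don't fragment) and `-s` (payload size) is the usual way to test jumbo frames from an ESXi host.

```python
# Header sizes in bytes (standard values; no IP options assumed).
IP_HEADER = 20    # IPv4 header
ICMP_HEADER = 8   # ICMP echo header
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
VLAN_TAG = 4      # optional 802.1Q tag
FCS = 4           # Ethernet frame check sequence


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented packet of size `mtu`."""
    return mtu - IP_HEADER - ICMP_HEADER


def frame_size_on_wire(mtu: int, vlan_tagged: bool = True) -> int:
    """Full Ethernet frame size for a packet of size `mtu`, including header and FCS."""
    return mtu + ETH_HEADER + FCS + (VLAN_TAG if vlan_tagged else 0)


def vmkping_command(target_ip: str, mtu: int = 9000) -> str:
    """Build a vmkping command that sends a full-size, don't-fragment ping."""
    return f"vmkping -d -s {icmp_payload_for_mtu(mtu)} {target_ip}"


if __name__ == "__main__":
    print(icmp_payload_for_mtu(9000))      # payload for a 9000-byte packet
    print(frame_size_on_wire(9000))        # frame size the switch must accept
    print(vmkping_command("192.168.1.2"))  # hypothetical VSAN vmkernel address
```

If a switch limit of 9000 applies to the whole frame, a tagged frame of 9022 bytes would be dropped, while a limit of 9216 leaves headroom. Running the generated `vmkping` command between the VSAN vmkernel ports of two hosts is a quick way to check whether full-size frames actually get through.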
So why didn’t VSAN report this as an issue? Currently, VSAN doesn’t use the larger jumbo frames to check that the cluster is correctly formed, so those checks can pass even when full-size frames are being dropped. Now that we know about this behaviour, we will be looking at addressing it going forward.
Note that the VSAN network also requires multicast. A multicast problem, however, shows up right away, because the cluster will not form without that functionality.