I just received a notification about KB article 2016122, which VMware has recently published. It deals with a topic that I’ve seen discussed on the community forums. The symptom is that during periods of high I/O, NFS datastores from NetApp arrays become unavailable for a short period of time before becoming available again. This seems to be observed primarily when the NFS datastores are presented to ESXi 5.x hosts.
The KB article describes a workaround for the issue, which is to tune the NFS queue depth on the ESXi hosts to reduce I/O congestion to the datastore. By default, the value of NFS.MaxQueueDepth is 4294967295 (the largest unsigned 32-bit value, which basically means unlimited). The workaround is to change this value to 64, which has been shown to prevent the disconnects. A permanent solution is still being investigated.
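To make the change concrete, here is a minimal Python sketch that builds the command line for applying the workaround value. It assumes the setting is changed through the `esxcli system settings advanced set` command using the option path `/NFS/MaxQueueDepth`; the helper name `build_set_command` and the range check are my own illustrations and do not come from the KB article. Check the KB itself for the authoritative steps, including whether the host needs a reboot afterwards.

```python
import shlex
from typing import List

# Default quoted in the post: 2**32 - 1, i.e. effectively unlimited.
DEFAULT_NFS_MAX_QUEUE_DEPTH = 2**32 - 1  # 4294967295
# Workaround value quoted in the post.
WORKAROUND_NFS_MAX_QUEUE_DEPTH = 64


def build_set_command(depth: int = WORKAROUND_NFS_MAX_QUEUE_DEPTH) -> List[str]:
    """Return the argument list that would set NFS.MaxQueueDepth on a host.

    Hypothetical helper: it only builds the command, it does not run it.
    The range check is an assumption (a positive value that fits in an
    unsigned 32-bit integer), not a limit documented by VMware.
    """
    if not 1 <= depth <= DEFAULT_NFS_MAX_QUEUE_DEPTH:
        raise ValueError(f"queue depth out of range: {depth}")
    return [
        "esxcli", "system", "settings", "advanced", "set",
        "-o", "/NFS/MaxQueueDepth",  # advanced option path (assumed)
        "-i", str(depth),            # integer value to apply
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it on a host.
    print(shlex.join(build_set_command()))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere. The output is meant to be reviewed and then run in the ESXi shell of each affected host.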