
NFS Best Practices – Part 2: Advanced Settings

Following on from part 1 of the NFS Best Practices, part 2 is going to look at tuning from a vSphere perspective. As mentioned, our objective is to update the NFS Best Practice white paper, which is now rather dated. There are quite a number of tunable parameters available to you when using NFS datastores. Before we drill into these advanced settings in a bit more detail, it is important to understand that the recommended values for some of these settings may (and probably will) vary from storage array vendor to storage array vendor. My objective is to give you a clear and concise explanation of the tunable parameters and allow you to make your own decisions when it comes to tuning the values.

Maximum Number of NFS Mounts per ESXi host

By default, the NFS.MaxVolumes value is 8, meaning that a maximum of 8 NFS volumes can be mounted to an ESXi host. This can be changed, as VMware supports a maximum of 256 NFS volumes mounted to an ESXi host. However, storage array vendors make their own recommendations around this value, and these recommendations vary from vendor to vendor, so it is best to check the appropriate documentation from your vendor. This is a per-ESXi-host setting and must be made on all hosts.
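If you prefer to script the change rather than click through the UI on every host, the setting can be read and updated through the vSphere API. Below is a minimal pyVmomi sketch; the vCenter address, credentials and the value of 256 are placeholders only, so substitute whatever your storage vendor actually recommends.

    # Minimal pyVmomi sketch: read and update NFS.MaxVolumes on every host.
    # The vCenter details and the target value are placeholders only.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
    si = SmartConnect(host="vcenter.example.com",
                      user="administrator@vsphere.local",
                      pwd="password", sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.HostSystem], True)
        for host in view.view:
            adv = host.configManager.advancedOption
            current = adv.QueryOptions("NFS.MaxVolumes")[0].value
            print(host.name, "NFS.MaxVolumes =", current)
            # Per-host setting, so it is applied to every host in the inventory.
            adv.UpdateOptions(changedValue=[
                vim.option.OptionValue(key="NFS.MaxVolumes", value=256)])
    finally:
        Disconnect(si)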

TCP/IP Heap Size

Net.TcpipHeapSize is the size of the memory (in MB) which is allocated up front by the VMkernel to the TCP/IP heap. Net.TcpipHeapMax is the maximum amount of memory which can be consumed by TCP/IP as heap. In vSphere 5.1, the default value for Net.TcpipHeapSize is 0MB and the default value for Net.TcpipHeapMax is 64MB. The maximum value for Net.TcpipHeapMax is 128MB. As one changes the default NFS.MaxVolumes as discussed previously, one also needs to adjust the heap space settings for TCP/IP accordingly. Again, follow the advice from your storage array vendor; the vendors make different recommendations for these values. These are per-ESXi-host settings and again must be made on all hosts.
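These can be updated in exactly the same way as NFS.MaxVolumes above. The short sketch below assumes you already have the host's OptionManager (adv, i.e. host.configManager.advancedOption) from the previous example; the values shown are purely illustrative, not a recommendation.

    from pyVmomi import vim

    def set_tcpip_heap(adv, heap_size_mb, heap_max_mb):
        # 'adv' is host.configManager.advancedOption, obtained as in the
        # NFS.MaxVolumes sketch. Use your storage vendor's recommended values;
        # note that heap size changes have historically required a host reboot
        # to take effect.
        adv.UpdateOptions(changedValue=[
            vim.option.OptionValue(key="Net.TcpipHeapSize", value=heap_size_mb),
            vim.option.OptionValue(key="Net.TcpipHeapMax", value=heap_max_mb),
        ])

    # Example call (illustrative values only):
    # set_tcpip_heap(adv, 32, 128)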

Heartbeats

The four heartbeat related settings (NFS.HeartbeatFrequency, NFS.HeartbeatDelta, NFS.HeartbeatTimeout & NFS.HeartbeatMaxFailures) can be discussed together. Basically, they are used for checking that an NFS datastore is operational. NFS.HeartbeatFrequency is set to 12 seconds by default. This means that every 12 seconds, the ESXi host will check to see if it needs to request a heartbeat from an NFS datastore and ensure that the datastore is still accessible. To prevent unnecessary heartbeating, the host only requests a new heartbeat from the datastore if it hasn't done any other operation to the datastore, confirming its availability, in the last NFS.HeartbeatDelta (default: 5 seconds).

So what happens if the datastore is not accessible? This is where NFS.HeartbeatTimeout & NFS.HeartbeatMaxFailures come in.

The NFS.HeartbeatTimeout value is set to 5 seconds by default. This is how long we wait for an outstanding heartbeat before we give up on it. NFS.HeartbeatMaxFailures is set to 10 by default. So if we have to give up on 10 consecutive heartbeats, we treat the NFS datastore as unreachable. If we work this out, it is 125 seconds (10 heartbeats at 12-second intervals + 5 seconds timeout for the last heartbeat) before we decide an NFS datastore is no longer operational.
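For anyone who likes to see the arithmetic spelled out, here it is as a few lines of Python; the variable names simply mirror the advanced settings and their default values.

    # Worst-case time to declare an NFS datastore unreachable, using defaults.
    heartbeat_frequency = 12     # NFS.HeartbeatFrequency (seconds)
    heartbeat_timeout = 5        # NFS.HeartbeatTimeout (seconds)
    heartbeat_max_failures = 10  # NFS.HeartbeatMaxFailures

    # 10 failed heartbeats issued at 12-second intervals, plus the 5-second
    # timeout on the final outstanding heartbeat.
    time_to_unreachable = (heartbeat_max_failures * heartbeat_frequency
                           + heartbeat_timeout)
    print(time_to_unreachable)   # 125 seconds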

We do continue to make heartbeat requests, however, in the hope that the datastore becomes available once again.

I have not seen any vendor recommendations to change NFS.HeartbeatFrequency, NFS.HeartbeatTimeout or NFS.HeartbeatMaxFailures from the defaults, but again refer to the storage vendor's documentation for any recommendations. I suspect the default values meet the requirements of most, if not all, vendors.

Locking

Similar to heartbeat, the lock related settings (NFS.DiskFileLockUpdateFreq, NFS.LockUpdateTimeout & NFS.LockRenewMaxFailureNumber) can be discussed together.

To begin, the first thing to mention is that VMware isn’t using the Network Lock Manager (NLM) protocol for NFS locking. Rather, we are using our own locking mechanism for NFS. VMware implements NFS locks by creating lock files named “.lck-<file_id>” on the NFS server.

To ensure consistency, I/O is only ever issued to the file on an NFS datastore when the client is the lock holder and the lock lease has not expired yet. Once a lock file is created, updates are sent to the lock file every NFS.DiskFileLockUpdateFreq seconds (default: 10). This lets the other ESXi hosts know that the lock is still active.

Locks can be preempted. Consider a situation where vSphere HA detects a host failure and wishes to start a VM on another host in the cluster. In this case, another host must be able to take ownership of that VM, so a method to time out the previous lock must exist. By default, a host will make 3 polling attempts (defined by NFS.LockRenewMaxFailureNumber) at 10-second intervals (defined by NFS.DiskFileLockUpdateFreq) to update the file lock. Each lock update attempt has a 5-second timeout (defined by NFS.LockUpdateTimeout).

In the worst case, when the last lock update attempt times out, it will take 3 * 10 + 5 = 35 seconds before the lock is marked expired on the lock holder client. Before the lock is marked expired, I/O will continue to be issued, even after failed lock update attempts.

Lock preemption on a competing client starts from the detection of a lock conflict. It then takes 3 polling attempts at 10-second intervals for the competing host to declare that the lock has expired and break it. It then takes another 10 seconds for the host to establish its own lock. Lock preemption is therefore completed in 3 * 10 + 10 = 40 seconds before I/O will start to flow on the competing host.
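Again, here is that arithmetic as a few lines of Python, using the default lock settings described above.

    # Lock expiry on the holder and lock preemption on a competing host.
    lock_update_freq = 10        # NFS.DiskFileLockUpdateFreq (seconds)
    lock_update_timeout = 5      # NFS.LockUpdateTimeout (seconds)
    lock_renew_max_failures = 3  # NFS.LockRenewMaxFailureNumber

    # Holder: 3 failed update attempts at 10-second intervals, plus the
    # 5-second timeout on the last attempt, before the lock is marked expired.
    holder_expiry = lock_renew_max_failures * lock_update_freq + lock_update_timeout

    # Competing host: 3 polling attempts at 10-second intervals to declare the
    # lock expired, plus another 10 seconds to establish its own lock.
    preemption_complete = lock_renew_max_failures * lock_update_freq + lock_update_freq

    print(holder_expiry, preemption_complete)   # 35 40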

I have not seen any storage vendors make recommendations to change these values from the default. Again, always double-check the vendor’s documentation to make sure.

Finally, it is extremely important that any changes to the lock settings are reflected on all hosts sharing the datastore. Having inconsistent lock settings across multiple hosts sharing the same NFS datastore can result in some very undesirable behaviour.
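One way to sanity-check this is to read the lock settings back from every host and compare them. The sketch below does just that with pyVmomi (the vCenter address and credentials are placeholders); it only reads values and changes nothing.

    # Read the NFS lock settings from every host and flag inconsistencies.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    LOCK_KEYS = ["NFS.DiskFileLockUpdateFreq",
                 "NFS.LockUpdateTimeout",
                 "NFS.LockRenewMaxFailureNumber"]

    ctx = ssl._create_unverified_context()  # lab only
    si = SmartConnect(host="vcenter.example.com",
                      user="administrator@vsphere.local",
                      pwd="password", sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.HostSystem], True)
        seen = {}
        for host in view.view:
            adv = host.configManager.advancedOption
            values = tuple(adv.QueryOptions(k)[0].value for k in LOCK_KEYS)
            seen[host.name] = values
            print(host.name, dict(zip(LOCK_KEYS, values)))
        if len(set(seen.values())) > 1:
            print("WARNING: lock settings are not consistent across hosts")
    finally:
        Disconnect(si)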

Do you change any other advanced settings when it comes to NFS datastores? If so, please leave a comment. I’d be interested in knowing more.

Get notification of these blogs postings and more VMware Storage information by following me on Twitter: @CormacJHogan
