NFS Best Practices – Part 2: Advanced Settings

Following on from part 1 of the NFS Best Practices series, part 2 is going to look at tuning from a vSphere perspective. As mentioned, our objective is to update the NFS Best Practices white paper, which is now rather dated. There are quite a number of tunable parameters available to you when using NFS datastores. Before we drill into these advanced settings in a bit more detail, it is important to understand that the recommended values for some of these settings may (and probably will) vary from storage array vendor to storage array vendor. My objective is to give you a clear and concise explanation of the tunable parameters and allow you to make your own decisions when it comes to tuning the values.

Maximum Number of NFS Mounts per ESXi host

By default, the NFS.MaxVolumes value is 8. This means that 8 is the maximum number of NFS volumes which can be mounted to an ESXi host. This can be changed, as VMware supports a maximum of 256 NFS volumes mounted to an ESXi host. However, storage array vendors make their own recommendations around this value, and these vary from vendor to vendor, so it is best to check the appropriate documentation from your vendor. This is a per-ESXi-host setting and must be done on all hosts.
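
If you would rather script the change than click through the vSphere Client on every host, something along the lines of the following pyVmomi (Python) sketch should do it. Treat it purely as an illustration: the vCenter address, credentials and the value of 32 are placeholders, and you should substitute whatever value your storage vendor recommends.

```python
# Minimal pyVmomi sketch: raise NFS.MaxVolumes on every ESXi host managed
# by a vCenter Server. Hostname, credentials and the value 32 are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim, VmomiSupport

ctx = ssl._create_unverified_context()   # lab use only - validate certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password",
                  sslContext=ctx)

content = si.RetrieveContent()
host_view = content.viewManager.CreateContainerView(content.rootFolder,
                                                    [vim.HostSystem], True)

# Integer advanced options are VIM "long" values, hence the explicit cast.
new_value = VmomiSupport.vmodlTypes["long"](32)

for host in host_view.view:
    opt_mgr = host.configManager.advancedOption
    opt_mgr.UpdateOptions(changedValue=[vim.option.OptionValue(key="NFS.MaxVolumes",
                                                               value=new_value)])
    print("%s: NFS.MaxVolumes set to 32" % host.name)

Disconnect(si)
```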

TCP/IP Heap Size

Net.TcpipHeapSize is the size of the memory (in MB) which is allocated up front by the VMkernel to the TCP/IP heap. Net.TcpipHeapMax is the maximum amount of memory which can be consumed by TCP/IP as heap. In vSphere 5.1, the default value for Net.TcpipHeapSize is 0MB and the default value for Net.TcpipHeapMax is 64MB. The maximum value for Net.TcpipHeapMax is 128MB. As one changes the default NFS.MaxVolumes as discussed previously, one also needs to adjust the heap space settings for TCP/IP accordingly. Again, follow the advice from your storage array vendor, as the vendors make different recommendations for these values. These are per-ESXi-host settings and again must be done on all hosts.
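
Before changing anything, it is worth checking what these are currently set to. The following is a minimal pyVmomi sketch that reads the values from a host's advanced options; the ESXi hostname and credentials are placeholders, and the exact option key names can be confirmed in the Advanced Settings list on your own build.

```python
# Minimal pyVmomi sketch: print the current NFS volume and TCP/IP heap
# settings on one ESXi host. Hostname and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()           # lab use only
si = SmartConnect(host="esxi01.example.com", user="root", pwd="password",
                  sslContext=ctx)

content = si.RetrieveContent()
host = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.HostSystem], True).view[0]

wanted = ("NFS.MaxVolumes", "Net.TcpipHeapSize", "Net.TcpipHeapMax")
for option in host.configManager.advancedOption.setting:
    if option.key in wanted:
        print("%-20s = %s" % (option.key, option.value))

Disconnect(si)
```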

Heartbeats

The four heartbeat related settings (NFS.HeartbeatFrequency, NFS.HeartbeatDelta, NFS.HeartbeatTimeout & NFS.HeartbeatMaxFailures) can be discussed together. Basically, they are used for checking that an NFS datastore is operational. NFS.HeartbeatFrequency is set to 12 seconds by default. This means that every 12 seconds, the ESXi host will check to see if it needs to request a heartbeat from an NFS datastore and ensure that the datastore is still accessible. To prevent unnecessary heartbeating, the host only requests a new heartbeat from the datastore if it hasn't done any other operation to the datastore, confirming its availability, in the last NFS.HeartbeatDelta (default: 5 seconds).

So what happens if the datastore is not accessible? This is where NFS.HeartbeatTimeout & NFS.HeartbeatMaxFailures come in.

The NFS.HeartbeatTimeout value is set to 5 seconds by default. This is how long we wait for an outstanding heartbeat before we give up on it. NFS.HeartbeatMaxFailures is set to 10 by default. So if we have to give up on 10 consecutive heartbeats, we treat the NFS datastore as unreachable. If we work this out, it is 125 seconds (10 heartbeats at 12-second intervals + a 5-second timeout for the last heartbeat) before we decide an NFS datastore is no longer operational.
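
To make the arithmetic explicit, here is that worst case written out as a few lines of Python using the default values:

```python
# Worst-case time to declare an NFS datastore unreachable, using the defaults.
heartbeat_frequency    = 12  # NFS.HeartbeatFrequency - seconds between heartbeat checks
heartbeat_timeout      = 5   # NFS.HeartbeatTimeout - seconds to wait for each heartbeat
heartbeat_max_failures = 10  # NFS.HeartbeatMaxFailures - consecutive failures tolerated

# 10 failed heartbeats issued at 12-second intervals, plus the 5-second
# timeout on the final heartbeat, before the datastore is marked unreachable.
time_to_unreachable = heartbeat_max_failures * heartbeat_frequency + heartbeat_timeout
print(time_to_unreachable)   # 125 seconds
```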

We do continue to make heartbeat requests, however, in the hope that the datastore becomes available once again.

I have not seen any vendor recommendations to change NFS.HeartbeatFrequency, NFS.HeartbeatTimeout or NFS.HeartbeatMaxFailures from the defaults, but again, refer to the storage vendor's documentation for any recommendations. I suspect the default values meet the requirements of most, if not all, vendors.

Locking

Similar to the heartbeat settings, the lock-related settings (NFS.DiskFileLockUpdateFreq, NFS.LockUpdateTimeout & NFS.LockRenewMaxFailureNumber) can be discussed together.

To begin, the first thing to mention is that VMware isn’t using the Network Lock Manager (NLM) protocol for NFS locking. Rather, we are using our own locking mechanism for NFS. VMware implements NFS locks by creating lock files named “.lck-<file_id>” on the NFS server.

To ensure consistency, I/O is only ever issued to a file on an NFS datastore when the client is the lock holder and the lock lease has not yet expired. Once a lock file is created, updates are sent to the lock file every NFS.DiskFileLockUpdateFreq seconds (default: 10). This lets the other ESXi hosts know that the lock is still active.

Locks can be preempted. Consider a situation where vSphere HA detects a host failure and wishes to start a VM on another host in the cluster. In this case, another host must be able to take ownership of that VM, so a method to time out the previous lock must exist. By default, a host will make 3 polling attempts (defined by NFS.LockRenewMaxFailureNumber) at 10-second intervals (defined by NFS.DiskFileLockUpdateFreq) to update the file lock. Each lock update attempt has a 5-second timeout (defined by NFS.LockUpdateTimeout).

In the worst case, when the last lock update attempt times out, it will take 3 * 10 + 5 = 35 seconds before the lock is marked expired on the lock holder client. Before the lock is marked expired, I/O will continue to be issued, even after failed lock update attempts.

Lock preemption on a competing client starts from the detection of a lock conflict. It then takes 3 polling attempts at 10-second intervals for the competing host to declare that the lock has expired and break it. It then takes another 10 seconds for the host to establish its own lock. Lock preemption will therefore be completed in 3 * 10 + 10 = 40 seconds before I/O starts to flow on the competing host.
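
Again, the timings fall straight out of the settings, so here are both worst-case figures worked through in Python with the default values:

```python
# Worst-case lock expiry (on the lock holder) and lock preemption (on a
# competing host) timings, using the default values.
lock_update_freq        = 10  # NFS.DiskFileLockUpdateFreq - seconds between lock updates
lock_update_timeout     = 5   # NFS.LockUpdateTimeout - seconds per update attempt
lock_renew_max_failures = 3   # NFS.LockRenewMaxFailureNumber - failed attempts tolerated

# Lock holder: 3 failed update attempts at 10-second intervals, plus the
# 5-second timeout on the last attempt, before it marks its own lock expired.
holder_expiry = lock_renew_max_failures * lock_update_freq + lock_update_timeout
print(holder_expiry)   # 35 seconds

# Competing host: 3 polling intervals of 10 seconds to decide the lock is
# stale, then another 10 seconds to establish its own lock before I/O flows.
preemption = lock_renew_max_failures * lock_update_freq + lock_update_freq
print(preemption)      # 40 seconds
```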

I have not seen any storage vendors make recommendations to change these values from the default. Again, always double-check the vendor’s documentation to make sure.

Finally, it is extremely important that any changes to the lock settings are reflected on all hosts sharing the datastore. Having inconsistent lock settings across multiple hosts sharing the same NFS datastore can result in some very undesirable behaviour.
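
One way to keep yourself honest here is to audit the settings across all of your hosts. The pyVmomi sketch below (vCenter details are placeholders again) simply gathers the lock-related values from every host in the inventory and flags any that differ:

```python
# Minimal pyVmomi sketch: flag inconsistent NFS lock settings across all
# ESXi hosts in a vCenter inventory. vCenter details are placeholders.
import ssl
from collections import defaultdict
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

LOCK_SETTINGS = ("NFS.DiskFileLockUpdateFreq",
                 "NFS.LockUpdateTimeout",
                 "NFS.LockRenewMaxFailureNumber")

ctx = ssl._create_unverified_context()           # lab use only
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)

content = si.RetrieveContent()
hosts = content.viewManager.CreateContainerView(content.rootFolder,
                                                [vim.HostSystem], True).view

values = defaultdict(dict)                       # setting -> {host name: value}
for host in hosts:
    for option in host.configManager.advancedOption.setting:
        if option.key in LOCK_SETTINGS:
            values[option.key][host.name] = option.value

for key, per_host in values.items():
    status = "OK" if len(set(per_host.values())) == 1 else "MISMATCH"
    print("%s: %s %s" % (key, status, per_host))

Disconnect(si)
```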

Do you change any other advanced settings when it comes to NFS datastores? If so, please leave a comment. I’d be interested in knowing more.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @CormacJHogan

9 Replies to “NFS Best Practices – Part 2: Advanced Settings”

  1. Hey Cormac – great article! A few comments/questions:
    – Maximum Number of NFS Mounts per ESXi host – why is 8 the default? I am not aware of any vendor (at least not any of the big ones) recommending anything less than 32 at this time. Also, why is it storage vendor specific? I am not aware of needing to make this type of change on a traditional Linux system nor can I see why the storage vendor would care how many mounts a host was making.
    – “I have not seen any vendor recommendations to change NFS.HeartbeatFrequency, NFS.HeartbeatTimeout or NFS.HeartbeatMaxFailures from the default” In talking with EMC and NetApp, both suggested the changes laid out in this old blog entry by Chad and Vaughn: http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html
    – “Do you change any other advanced settings when it comes to NFS datastores?” The first three categories you mentioned (# mounts, tcp settings, heartbeat) are the only ones I have seen changed. Locking has actually never come up in my case.

    1. Hi Steve,
      Yes – I agree. The value of 8 is rather low. Maybe it is time to increase the default. You are correct regarding heartbeats, but I do not believe changing this is a best practice in any of the official documentation I’ve read from any of the vendors. If I’ve missed something, let me know. The only other advanced setting I’ve seen changed is NFS.MaxConnPerIP. I’m going to do a post on it shortly.

      1. Hey Cormac, I believe EMC needs to go through a similar exercise of updating best practices guides 🙂 Speaking from experience, it is possible to run into an issue with Celerra / VNX File if you do not increase the heartbeat values – though the issue is typically application specific. When working with EMC support, the documents referenced were the link I provided and section 3.6.1.5 of H5536-vmware-esx-srvr-using-emc-celerra-stor-sys-wp.pdf available on Powerlink. In terms of NetApp, the Virtual Storage Console does change the heartbeat settings on ESXi. As for documentation, surprisingly the best practices guides do not list this step, but I was able to find: http://www.vmworld.com/servlet/JiveServlet/previewBody/4861-102-1-6646/tr-3839.pdf and http://kb.vmware.com/kb/1012062. I hope this helps!

  2. Hi Cormac,

    In vSphere 5.1 I do not see the property “NFS.MaxConnPerIP”. In vSphere 5.0 there used to be a setting with a default value of 4. Is there any change in behaviour from 5.0?

    1. It’s actually SunRPC.MaxConnPerIP. The white paper will be published imminently – it’ll explain this setting in more detail.
