CormacHogan.com

Getting to grips with NFSv4.1 and Kerberos

Over the past few weeks, I’ve been looking to update some of our older white papers on core storage topics. One of the outdated papers was on NFS, and a lot had changed in this space since the paper was last updated. Most notable was the introduction of support for NFS v4.1 in vSphere 6.0, along with Kerberos-based authentication. In vSphere 6.5, we also added Kerberos integrity checking. I decided to have a go at configuring this in my own lab. Before going any further, I need to thank Justin Parisi of NetApp for his guidance through this setup. He’s even gone ahead and written up an excellent blog post describing the steps using the NetApp OnTap appliance. This should be the first place to go for guidance on how to do this setup. As Justin states in his post, setting this up is a PITA. You’ll see why soon. What follows are some of my own observations, trials and tribulations on trying to get this to work in my own lab.

First off, let’s consider the lab. My environment consisted of:

Simple setup steps for ESXi hosts:

More advanced setup steps for ESXi hosts:

See what I mean about this being a PITA! Anyways, that is pretty much everything that needs to be done for the ESXi host side of things for the moment. Let’s turn our attention to the target side of things next. In my case, this is a NetApp OnTap Simulator (version 9.2).

Setup steps for NetApp simulator:

With all of this in place, we are ready to go through the final few steps to support Kerberos-based authentication for NFS v4.1 datastores.

A word of advice: At this point, create a volume with an export policy and verify that you can successfully mount this NFS v4.1 volume from your ESXi hosts using AUTH_SYS authentication rather than Kerberos. It is worth validating your data paths and exports before trying any Kerberos-related stuff and adding more complexity to the mix.

Kerberos setup steps on the NetApp simulator:

Step 1 is to set up the Kerberos realm. This basically mirrors my Active Directory configuration. I gave it the same name as my AD domain (rainpole.com) but in uppercase letters, so it is RAINPOLE.COM. The setup simply involved adding details about the AD environment.

This next bit was the one that really had me confused. It is the Kerberos interface, essentially enabling Kerberos on the data paths of the SVM. Here is an example of one of my interfaces:

In my setup, my SVM had two data interfaces, netappc and netappd. These were both in DNS, with forward and reverse lookups. In this example, we are looking at interface netappc. The Kerberos Realm is RAINPOLE.COM, mentioned previously. Now the Service Principal Name takes the following format: nfs/<fqdn-of-my-interface>@Kerberos-Realm. Therefore my SPN is nfs/netappc.rainpole.com@RAINPOLE.COM. The Admin username and password are only required for the enabling and disabling of Kerberos on the interface, as this SPN is added to AD, as shown below (note the odd names that they take in AD):

The SPN can now be queried from AD using the following commands (thanks to Justin again for his help here).

PS C:\Users\Administrator> Get-ADComputer nfs-netappd-rai -Properties servicePrincipalName

DistinguishedName : CN=NFS-NETAPPD-RAI,CN=Computers,DC=rainpole,DC=com
DNSHostName : NFS-NETAPPD-RAI.RAINPOLE.COM
Enabled : True
Name : NFS-NETAPPD-RAI
ObjectClass : computer
ObjectGUID : 18454529-c6ef-4d93-bc33-99c6f2d830b8
SamAccountName : NFS-NETAPPD-RAI$
servicePrincipalName : {nfs/netappd.rainpole.com, nfs/nfs-netappd-rai.rainpole.com, nfs/NFS-NETAPPD-RAI,
 HOST/nfs-netappd-rai.rainpole.com...}
SID : S-1-5-21-1660322180-797832923-1225732573-5694
UserPrincipalName :

Note the servicePrincipalName line. The first entry has netappd.rainpole.com, which is the FQDN of our interface. As long as these match up, you are good to go (I spun my wheels here for the longest time, trying to figure out what was correct and what was not). Again, note that these only appear in AD once the Kerberos interfaces are configured. Make sure these are visible and correct before going any further. You’ll have a unique entry for each interface.
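To make the "match up" check concrete, here is a minimal Python sketch that builds the SPN in the nfs/<fqdn>@REALM format described earlier and tests whether an interface FQDN appears in a servicePrincipalName list. The function names `build_spn` and `ad_has_spn` are hypothetical helpers made up for illustration, the case-insensitive comparison is an assumption of this sketch, and the list is copied from the (truncated) Get-ADComputer output above.

```python
def build_spn(interface_fqdn, realm):
    """Return nfs/<fqdn>@<REALM>, with the realm forced to uppercase."""
    return "nfs/{}@{}".format(interface_fqdn.lower(), realm.upper())


def ad_has_spn(service_principal_names, interface_fqdn):
    """Check that nfs/<fqdn> is present in an AD servicePrincipalName list.

    The entries in AD carry no @REALM suffix. Comparing case-insensitively
    is an assumption of this sketch.
    """
    wanted = "nfs/{}".format(interface_fqdn).lower()
    return any(spn.lower() == wanted for spn in service_principal_names)


# Copied from the Get-ADComputer output above (that list is truncated there).
spns = [
    "nfs/netappd.rainpole.com",
    "nfs/nfs-netappd-rai.rainpole.com",
    "nfs/NFS-NETAPPD-RAI",
    "HOST/nfs-netappd-rai.rainpole.com",
]

print(build_spn("netappd.rainpole.com", "rainpole.com"))
print(ad_has_spn(spns, "netappd.rainpole.com"))  # interface this object belongs to
print(ad_has_spn(spns, "netappc.rainpole.com"))  # the other interface has its own object
```

Each interface has its own AD entry, so the check needs to be run against the servicePrincipalName list of the matching object.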

The final piece of this setup is to make the same change that we made to the ESXi hosts previously. In Active Directory Administrative Centre, select each NetApp SPN in turn, open its Properties, go to the Extensions section, click the Attribute Editor, and scroll down to the msDS-SupportedEncryptionTypes field. Edit this field, and provide the value of 24 (0x18).
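The value 24 is a bitmask rather than an arbitrary number. As a small sketch of how it breaks down, the Python below uses the bit assignments from Microsoft's documentation of msDS-SupportedEncryptionTypes (0x08 for AES128 and 0x10 for AES256). The class and function names are hypothetical labels chosen for this example.

```python
from enum import IntFlag


class EncType(IntFlag):
    """Bit values of msDS-SupportedEncryptionTypes (member names are my own labels)."""
    DES_CBC_CRC = 0x01
    DES_CBC_MD5 = 0x02
    RC4_HMAC = 0x04
    AES128_CTS_HMAC_SHA1_96 = 0x08
    AES256_CTS_HMAC_SHA1_96 = 0x10


def decode(value):
    """List the encryption types enabled in a given attribute value."""
    return [flag.name for flag in EncType if value & flag]


# AES128 plus AES256 gives the value entered in the Attribute Editor.
aes_only = EncType.AES128_CTS_HMAC_SHA1_96 | EncType.AES256_CTS_HMAC_SHA1_96

print(int(aes_only), hex(aes_only))
print(decode(24))
```

So setting the field to 24 advertises only the two AES encryption types for that object.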

Exporting an NFS volume

This can be summarized in 3 steps:

The export policy is critical. This is where I had the most difficulty. You need to be aware of the first field, which is the Client Specification. It does not seem to like CIDR formats (other than 0.0.0.0/0) or hostnames/FQDNs. I spent ages figuring out why, every time I tried to mount a volume, it failed as follows:

WARNING: NFS41: NFS41FSGetRootFH:4234: Lookup nas02_data_1 failed for volume vol2: Permission denied
WARNING: NFS41: NFS41FSCompleteMount:3762: NFS41FSGetRootFH failed: Permission denied
WARNING: NFS41: NFS41FSDoMount:4399: First attempt to mount the filesystem failed: Permission denied
WARNING: NFS41: NFS41_FSMount:4683: NFS41FSDoMount failed: Permission denied

Once I used the IP address of the ESXi hosts in the Client Specification, it all started to work as expected.
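If you want to sanity-check a list of Client Specification entries before typing them into the export policy, something like the following Python sketch can flag the forms that gave me trouble. It only encodes the observation above (plain IP addresses and 0.0.0.0/0 worked in this lab), not any documented NetApp rule. The function names, the addresses and the FQDN are all made up for illustration.

```python
import ipaddress


def classify_client_spec(spec):
    """Return 'ip', 'cidr', or 'hostname' for one Client Specification entry."""
    try:
        ipaddress.ip_address(spec)
        return "ip"
    except ValueError:
        pass
    try:
        ipaddress.ip_network(spec, strict=False)
        return "cidr"
    except ValueError:
        return "hostname"


def matches_lab_observation(spec):
    """True for the forms that worked in this lab: a plain IP or 0.0.0.0/0."""
    return classify_client_spec(spec) == "ip" or spec == "0.0.0.0/0"


# Hypothetical entries, for illustration only.
for spec in ["192.168.1.21", "192.168.1.0/24", "0.0.0.0/0", "esxi-dell-e.rainpole.com"]:
    print(spec, classify_client_spec(spec), matches_lab_observation(spec))
```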

Setting up NFS users

The final part of the puzzle is the requirement to create some users on the NetApp. One of these is the SPN user called “nfs” and the other is the user we used on the ESXi side (“chogan”) to establish NFS Kerberos credentials. Interestingly, I seemed to be able to mount my NFS volumes without having the “nfs” user, but I definitely needed the NFS Kerberos credentials user (“chogan”) created on the NetApp side. Without this user defined, I got the following when trying to mount NFS v4.1 volumes using Kerberos authentication:

WARNING: NFS41: NFS41FSWaitForCluster:3637: Failed to wait for the cluster to be located: Timeout
WARNING: NFS41: NFS41_FSMount:4683: NFS41FSDoMount failed: Timeout
StorageApdHandler: 1062: Freeing APD handle 0x430c89c16d70 []
StorageApdHandler: 1147: APD Handle freed!
WARNING: NFS41: NFS41_VSIMountSet:431: NFS41_FSMount failed: Timeout
.
.
WARNING: SunRPC: 742: Failed to send NULLPROC for xid 0x1e42b9be: RPC connection reset 0xe

This is also the failure I got when Kerberos was not configured on the SVM interfaces. So there is some behaviour here that I still need to figure out.
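Since the two failure signatures in this post point at different causes, a rough triage helper can save some head-scratching. The Python sketch below maps vmkernel log lines to the causes I ran into. The patterns and hint texts are drawn only from the observations in this post, are not an exhaustive or authoritative list, and `hint_for` is a hypothetical name.

```python
import re

# (pattern, hint) pairs based solely on the failures described in this post.
HINTS = [
    (re.compile(r"NFS41.*Permission denied"),
     "check the export policy Client Specification (the ESXi host IP address worked)"),
    (re.compile(r"NFS41.*Timeout"),
     "check that Kerberos is enabled on the SVM interface and that the "
     "NFS Kerberos credentials user exists on the NetApp"),
]


def hint_for(log_line):
    """Return a troubleshooting hint for a vmkernel log line, or None."""
    for pattern, hint in HINTS:
        if pattern.search(log_line):
            return hint
    return None


samples = [
    "WARNING: NFS41: NFS41FSGetRootFH:4234: Lookup nas02_data_1 failed for volume vol2: Permission denied",
    "WARNING: NFS41: NFS41FSWaitForCluster:3637: Failed to wait for the cluster to be located: Timeout",
    "StorageApdHandler: 1147: APD Handle freed!",
]
for line in samples:
    print(hint_for(line))
```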

 

Checking status of NFS v4.1 with Kerberos from CLI

It is possible to tell the authentication type used to mount an NFS v4.1 volume from the CLI. The Security column in the output of the following command tells you. If the Security is SEC_KRB5, then Kerberos has been used. If it is AUTH_SYS, then the volume was mounted with the “normal” authentication mechanism rather than Kerberos. Ignore the hosts listing. As I mentioned, my SVM had two interfaces and I could mount my volumes on either netappc or netappd. Unfortunately there is no NFS v4.1 multipath support on the NetApp at this time, so I can’t do much with it. One final note: vol3 is using SEC_KRB5I, which is Kerberos authentication plus data integrity. The setup steps are the same.

[root@esxi-dell-e:~] esxcli storage nfs41 list
Volume Name  Host(s)          Share          Accessible  Mounted  Read-Only  Security    isPE  Hardware Acceleration
-----------  ---------------  -------------  ----------  -------  ---------  ---------  -----  ---------------------
vol4         netappc          vol4                 true     true      false  SEC_KRB5   false  Not Supported
vol3         netappd,netappc  /vol3                true     true      false  SEC_KRB5I  false  Not Supported
vol2         netappd,netappc  /nas02_data_1        true     true      false  AUTH_SYS   false  Not Supported
vol1         netappc          /nas01_data_1        true     true      false  SEC_KRB5   false  Not Supported
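If you have more than a handful of datastores, reading the Security column by eye gets tedious. Here is a minimal Python sketch that parses the text output shown above into a volume-to-security mapping. It assumes columns are separated by two or more spaces and that no cell is empty, which holds for this sample but is not guaranteed in general. The function name `parse_nfs41_list` is hypothetical.

```python
import re

# Output of "esxcli storage nfs41 list", as shown above.
SAMPLE = """\
Volume Name  Host(s)          Share          Accessible  Mounted  Read-Only  Security    isPE  Hardware Acceleration
-----------  ---------------  -------------  ----------  -------  ---------  ---------  -----  ---------------------
vol4         netappc          vol4                 true     true      false  SEC_KRB5   false  Not Supported
vol3         netappd,netappc  /vol3                true     true      false  SEC_KRB5I  false  Not Supported
vol2         netappd,netappc  /nas02_data_1        true     true      false  AUTH_SYS   false  Not Supported
vol1         netappc          /nas01_data_1        true     true      false  SEC_KRB5   false  Not Supported
"""


def parse_nfs41_list(text):
    """Turn the tabular output into a list of dicts keyed by column header."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0])
    rows = []
    for ln in lines[2:]:  # skip the header and the dashed separator
        rows.append(dict(zip(header, re.split(r"\s{2,}", ln))))
    return rows


rows = parse_nfs41_list(SAMPLE)
security_by_volume = {row["Volume Name"]: row["Security"] for row in rows}
kerberos_volumes = [v for v, s in security_by_volume.items() if s.startswith("SEC_KRB5")]

print(security_by_volume)
print(kerberos_volumes)
```

Any volume whose Security value starts with SEC_KRB5 was mounted with Kerberos, and SEC_KRB5I adds the integrity checking on top.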
 

Conclusion

I hope you find this useful. As I said at the beginning, please go to Justin’s blog for more in-depth step-by-step instructions. I still have a few questions about how all of this hangs together, and some other weird behaviour that I’m seeing (probably some future blogs). Hopefully my own personal observations on what is involved in this setup will also be beneficial to you in some way.
