Kubernetes Storage on vSphere 101 – NFS revisited

In my most recent 101 post on ReadWriteMany volumes, I shared an example whereby we created an NFS server in a Pod which automatically exported a File Share. We then mounted the File Share to multiple NFS client Pods deployed in the same namespace. We saw how multiple Pods were able to write to the same ReadWriteMany volume, which was the purpose of the exercise. I received a few questions on the back of that post relating to the use of Services. In particular, could an external NFS client, even one outside of the K8s cluster, access a volume from an NFS Server running in a Pod?

Therefore, in this post, we will look at how to do just that. We will be creating a Service that can be used by external clients to mount a File Share from an NFS Server running in a K8s Pod. To achieve this, we will be looking at a Service type that we haven’t seen before, the Load Balancer type. Using this Service type means that our NFS Server will be assigned an external IP address, which should allow our clients to access the NFS exports. If you have a K8s distribution that already has a Container Network Interface (CNI) deployed that can provide these external IP addresses, e.g. NSX-T, then great. If not, I will introduce you to the MetalLB Load Balancer later in the post, which will provide external IP addresses to your Load Balancer Services.

Please note that one would typically use a Load Balancer Service to load balance requests across multiple Pods that are part of a StatefulSet or ReplicaSet. However, we’re not going to delve into that functionality here; we are just using it for external access. As I said in previous posts, I may look at doing a post about Services in more detail at some point.

To begin this demonstration, let’s create the NFS Server Pod. Here is the manifest:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nfs-server-ext
  namespace: nfs
  labels:
    app: nfs-server-ext
spec:
  serviceName: nfs-server-svc-ext
  replicas: 1
  selector:
    matchLabels:
      app: nfs-server-ext
  template:
    metadata:
      labels:
        app: nfs-server-ext
    spec:
      containers:
      - name: nfs-server-ext
        image: gcr.io/google_containers/volume-nfs:0.8
        ports:
          - name: nfs
            containerPort: 2049
          - name: mountd
            containerPort: 20048
          - name: rpcbind
            containerPort: 111
        securityContext:
          privileged: true
        volumeMounts:
        - name: nfs-export
          mountPath: /exports
  volumeClaimTemplates:
  - metadata:
      name: nfs-export
      annotations:
        volume.beta.kubernetes.io/storage-class: nfs-sc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi

I am deploying my NFS Server as a StatefulSet, but with only a single replica. This simply means that should the Pod fail, the StatefulSet will take care of restarting it, etc. You can review StatefulSets here. What this manifest does is instruct K8s to create a 5Gi volume from the storage defined in my StorageClass nfs-sc and present it to the NFS server container as /exports. The NFS Server container (running in the Pod) is configured to automatically export the directory /exports as a File Share. Basically, whatever size of volume we add to the manifest with the name nfs-export is automatically exported. I have opened 3 container ports for the NFS server: 2049 for nfs, 20048 for mountd and 111 for the portmapper/rpcbind. These are required for NFS to work. Let’s look at the StorageClass next:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nfs-sc
provisioner: kubernetes.io/vsphere-volume
parameters:
    diskformat: thin
    storagePolicyName: raid-1
    datastore: vsanDatastore

This StorageClass is referencing the VCP (vSphere Cloud Provider) storage driver called vsphere-volume. It is stating the volumes should be instantiated on the vSAN datastore, and the policy used should be the Storage Policy called raid-1. Nothing really new here – we have seen this many times in previous 101 posts. If you need to review StorageClasses, you can do that here.

Now we come to the interesting bit – the service. It includes a Load Balancer reference so that it can be assigned an external IP address. I am fortunate in that I have a lab setup with NSX-T, meaning that NSX-T is configured with a Floating IP Pool to provide me with external IP addresses when I need them for Load Balancer service types. Access to NSX-T is not always possible, so in those cases, I have used MetalLB in the past. To deploy a MetalLB load balancer, you can use the following command to create the necessary Kubernetes objects and privileges:

$ kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml 
namespace/metallb-system created 
serviceaccount/controller created 
serviceaccount/speaker created 
clusterrole.rbac.authorization.k8s.io/metallb-system:controller created 
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created 
role.rbac.authorization.k8s.io/config-watcher created 
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created 
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created 
rolebinding.rbac.authorization.k8s.io/config-watcher created 
daemonset.apps/speaker created 
deployment.apps/controller created

Once the MetalLB Load Balancer is deployed, it is simply a matter of creating a ConfigMap with a pool of IP addresses for it to use for any Load Balancer service types that require external IP addresses. Here is a sample ConfigMap YAML that I have used in the past, with a range of external IP addresses configured. Modify the IP address range appropriately for your environment. Remember that these IP addresses must be on a network that can be accessed by your NFS clients, so they can mount the exported filesystems from the NFS Server Pod:

$ cat layer2-config.yaml
apiVersion: v1 
kind: ConfigMap 
metadata:  
  namespace: metallb-system  
  name: config
data:  
  config: |    
    address-pools:    
    - name: my-ip-space      
      protocol: layer2      
      addresses:      
      - 10.27.51.172-10.27.51.178
$ kubectl apply -f layer2-config.yaml 
configmap/config created

Now when your service starts with type Load Balancer, it will be allocated one of the IP addresses from the pool of IP addresses in the ConfigMap.
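Before applying the ConfigMap, it can be handy to sanity-check how many addresses a pool actually contains. The following is a minimal sketch using Python's standard ipaddress module. The helper name expand_pool is hypothetical (it is not part of MetalLB), and it assumes the pool is written in the start-end form shown above rather than as a CIDR block.

```python
import ipaddress
from typing import List


def expand_pool(pool: str) -> List[str]:
    """Expand a 'start-end' IPv4 range (the form used in the ConfigMap above)."""
    start_text, end_text = pool.split("-")
    start = ipaddress.IPv4Address(start_text.strip())
    end = ipaddress.IPv4Address(end_text.strip())
    if end < start:
        raise ValueError(f"range end {end} is below range start {start}")
    # IPv4Address converts to and from int, so the range is simple arithmetic.
    return [str(ipaddress.IPv4Address(n)) for n in range(int(start), int(end) + 1)]


addresses = expand_pool("10.27.51.172-10.27.51.178")
print(len(addresses), "addresses available for Load Balancer Services")
print(addresses[0], "...", addresses[-1])
```

Each Load Balancer Service consumes one address from the pool, so the count is also the number of such Services the pool can serve at once.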

Since my Container Network Interface (CNI) is NSX-T, I don’t need to worry about that. As soon as I specify type: LoadBalancer in my Service manifest file, NSX-T will retrieve an available address from the preconfigured pool of Floating IP addresses, and allocate it to my service. Here is my manifest for the NFS server service, which opens the same network ports as our NFS Server container running in the Pod:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: nfs-server-svc-ext
  name: nfs-server-svc-ext
  namespace: nfs
spec:
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    app: nfs-server-ext
  type: LoadBalancer

Note the selector, and how it matches the label in the Pod. Backing Pods are both discovered and connected using the selector.
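To make the selector rule concrete, here is a small Python sketch of equality-based label matching, using the values from the two manifests in this post. It is only a model of the behaviour, not Kubernetes code: the function name selector_matches is hypothetical, and it covers only non-empty selectors like the one used here.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """A Pod backs a Service when every selector key/value appears in the Pod's labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Values copied from the Service and StatefulSet manifests in this post.
service_selector = {"app": "nfs-server-ext"}
service_ports = {"nfs": 2049, "mountd": 20048, "rpcbind": 111}
pod_labels = {"app": "nfs-server-ext"}
container_ports = {"nfs": 2049, "mountd": 20048, "rpcbind": 111}

print("selector matches Pod:", selector_matches(service_selector, pod_labels))

# The Service's own label (app: nfs-server-svc-ext) plays no part in selection.
print("Service label matches Pod:", selector_matches({"app": "nfs-server-svc-ext"}, pod_labels))

# Ports the Service exposes that the container does not declare (ideally none).
unbacked = sorted(set(service_ports.values()) - set(container_ports.values()))
print("Service ports with no matching container port:", unbacked)
```

A mismatch between the selector and the Pod labels is a common reason for a Service ending up with no endpoints, which is why it is worth checking.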

The next step is to go ahead and deploy the StorageClass and StatefulSet for the NFS Server. We will verify that the PVC, PV and Pod get created accordingly.

$ kubectl create -f nfs-sc.yaml
storageclass.storage.k8s.io/nfs-sc created

$ kubectl create -f nfs-server-sts-ext.yaml 
statefulset.apps/nfs-server-ext created 

$ kubectl get pv 
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                             STORAGECLASS   REASON   AGE 
pvc-daa8629c-9746-11e9-8893-005056a27deb   5Gi        RWO            Delete           Bound    nfs/nfs-export-nfs-server-ext-0   nfs-sc                  111m 

$ kubectl get pvc 
NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE 
nfs-export-nfs-server-ext-0   Bound    pvc-daa8629c-9746-11e9-8893-005056a27deb   5Gi        RWO            nfs-sc         112m 

$ kubectl get pod 
NAME               READY   STATUS    RESTARTS   AGE 
nfs-server-ext-0   1/1     Running   0          93s
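The names in the output above are predictable: a StatefulSet names its Pods after the StatefulSet plus an ordinal, and names each PVC after the volumeClaimTemplate, the StatefulSet and the same ordinal. A short Python sketch of that convention follows; the function names are hypothetical and only illustrate the naming pattern.

```python
def statefulset_pod_name(statefulset: str, ordinal: int) -> str:
    """StatefulSet Pods are named <statefulset>-<ordinal>."""
    return f"{statefulset}-{ordinal}"


def statefulset_pvc_name(claim_template: str, statefulset: str, ordinal: int) -> str:
    """PVCs created from volumeClaimTemplates are named <template>-<statefulset>-<ordinal>."""
    return f"{claim_template}-{statefulset}-{ordinal}"


print(statefulset_pod_name("nfs-server-ext", 0))
print(statefulset_pvc_name("nfs-export", "nfs-server-ext", 0))
```

Because the PVC name is derived from the ordinal, a restarted nfs-server-ext-0 Pod is reattached to the same claim rather than getting a new volume.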

The Pod is up and running. Now let’s deploy our Load Balancer Service and check that the Endpoints get created correctly.

$ kubectl create -f nfs-server-svc-ext-lb.yaml
service/nfs-server-svc-ext created

$ kubectl get svc
NAME                 TYPE           CLUSTER-IP       EXTERNAL-IP                 PORT(S)                                        AGE
nfs-server-svc-ext   LoadBalancer   10.100.200.235   100.64.0.1,192.168.191.67   2049:32126/TCP,20048:31504/TCP,111:30186/TCP   5s

$ kubectl get endpoints
NAME                 ENDPOINTS                                         AGE
nfs-server-svc-ext   172.16.5.2:20048,172.16.5.2:111,172.16.5.2:2049   9s

$ kubectl describe endpoints nfs-server-svc-ext
Name:         nfs-server-svc-ext
Namespace:    nfs
Labels:       app=nfs-server-svc-ext
Annotations:  <none>
Subsets:
  Addresses:          172.16.5.2
  NotReadyAddresses:  <none>
  Ports:
    Name     Port   Protocol
    ----     ----   --------
    mountd   20048  TCP
    rpcbind  111    TCP
    nfs      2049   TCP

Events:  <none>

All looks good. The external IP address to note above is 192.168.191.67. This address, allocated by NSX-T from my floating IP address pool, can be used by other apps running in my environment, so long as they have a route to it. Not only do I get an IP address, but I also get a DNS entry for my service (which is the same name as the service). There is also load balancing of requests to the IP address, which distributes requests across all of the back-end Pods that implement the service (although I only have 1 back-end Pod, so this is not really relevant here).
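In the kubectl get svc output above, each entry in the PORT(S) column has the form port:nodePort/protocol. The following Python sketch splits that column into its parts. The ServicePort type and the parse_ports_column helper are hypothetical names used for illustration only.

```python
from typing import List, NamedTuple


class ServicePort(NamedTuple):
    port: int        # port exposed on the cluster IP and the external IP
    node_port: int   # port allocated on the nodes for this Service
    protocol: str


def parse_ports_column(column: str) -> List[ServicePort]:
    """Parse comma-separated entries of the form '<port>:<nodePort>/<protocol>'."""
    parsed = []
    for entry in column.split(","):
        ports, protocol = entry.strip().split("/")
        port, node_port = ports.split(":")
        parsed.append(ServicePort(int(port), int(node_port), protocol))
    return parsed


for sp in parse_ports_column("2049:32126/TCP,20048:31504/TCP,111:30186/TCP"):
    print(f"{sp.protocol} {sp.port} -> nodePort {sp.node_port}")
```

Note that all three entries are TCP, since the Service manifest does not set a protocol and TCP is the default.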

It is also useful to query the endpoints since these will only populate once the Pod has mapped/bound successfully to the service. It ensures that your labeling and selector are working correctly between Pod and Service. It also displays the IP address of the NFS Server Pod, and any configured ports.

With the Load Balancer service now created, a pod, virtual machine or bare-metal server running NFS client software should be able to mount the share exported from my NFS server Pod.
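Before trying a mount from an external client, a quick TCP reachability check against the three Service ports can rule out basic network problems. This is a minimal sketch using Python's standard socket module. The function names are hypothetical, and the port list is simply the one from the Service manifest in this post.

```python
import socket

# TCP ports exposed by the Service manifest in this post.
NFS_TCP_PORTS = {"rpcbind": 111, "nfs": 2049, "mountd": 20048}


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_nfs_service(host: str) -> dict:
    """Probe each port that the Service exposes and report which ones answer."""
    return {name: tcp_port_open(host, port) for name, port in NFS_TCP_PORTS.items()}
```

From the client, calling check_nfs_service("192.168.191.67") (substituting your own external IP address) returns a dictionary of port names mapped to True or False. A successful connection only shows that something is accepting TCP connections on that port; it does not prove that the export can be mounted.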

Now, if you had used the MetalLB Load Balancer, then you would expect the external IP address allocated to the Load Balancer Service to be one of the range of IP addresses placed in the ConfigMap for the MetalLB.

You might ask why I don’t just scale out the StatefulSet to 3 replicas or something and allow the requests to load balance. The thing to keep in mind is that this NFS Server has no built-in replication, and each Pod is using its own unique Persistent Volume (PV). So let’s say my client keeps connecting to Pod-0 and writes lots of data to Pod-0’s PV. Now Pod-0 fails, so I am redirected/proxied to Pod-1. Well, Pod-1’s PV will have none of the data that I wrote to Pod-0’s PV – it will be empty since there is no replication built into the NFS Server Pod. Note that Kubernetes does not do any replication of data in a ReplicaSet or a StatefulSet – it is up to the application running in the Pods to do this.
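The scenario above can be illustrated with a toy Python model in which each replica owns a private volume and nothing copies data between them. This is purely a simulation for illustration: the NfsServerPod class is hypothetical and does not talk to Kubernetes or NFS.

```python
class NfsServerPod:
    """Toy model: each replica owns a private PV, with no replication between them."""

    def __init__(self, name: str):
        self.name = name
        self.pv = {}  # file name -> contents; stands in for this Pod's own volume

    def write(self, path: str, data: str) -> None:
        self.pv[path] = data

    def listdir(self) -> list:
        return sorted(self.pv)


# Three replicas, as in the "why not scale out?" question.
pods = [NfsServerPod(f"nfs-server-ext-{i}") for i in range(3)]

# The client happens to be routed to Pod-0 and writes there.
pods[0].write("my-new-file", "hello")

# Pod-0 fails; the Service now routes the client to Pod-1.
print(pods[0].name, "had:", pods[0].listdir())
print(pods[1].name, "has:", pods[1].listdir())
```

The file exists only on the first replica's volume, so a client redirected to another replica sees an empty export.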

Note: I usually use a showmount -e command pointed at the NFS server to see what shares/directories it is exporting. From my Ubuntu client VM below, you can see that it is not working for the NFS server Pod, but if I point it at another NFS server IP address (in a VM) it works. I’m unsure why it is not working for my NFS Server Pod.

# showmount -e 192.50.0.4
Export list for 192.50.0.4:
/nfs *

# showmount -e 192.168.191.67
clnt_create: RPC: Port mapper failure - Unable to receive: errno 111 (Connection refused)
#

However, the rpcinfo -p command works just fine when pointed at the NFS Server Pod.

# rpcinfo -p 192.168.191.67
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100003    3   udp   2049  nfs
    100227    3   udp   2049  nfs_acl
    100021    1   udp  59128  nlockmgr
    100021    3   udp  59128  nlockmgr
    100021    4   udp  59128  nlockmgr
    100021    1   tcp  43089  nlockmgr
    100021    3   tcp  43089  nlockmgr
    100021    4   tcp  43089  nlockmgr
    100024    1   udp  35175  status
    100024    1   tcp  46455  status
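The rpcinfo listing above can also be compared against the ports that the Service exposes. The Python sketch below parses an abbreviated copy of that output (only a few of its rows), checks that portmapper, mountd and nfs are registered over TCP, and lists registrations that fall outside the three TCP ports in the Service manifest. The helper name parse_rpcinfo is hypothetical.

```python
from typing import Set, Tuple

# Abbreviated copy of the rpcinfo -p output above (a subset of its rows).
RPCINFO_OUTPUT = """\
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100005    3   tcp  20048  mountd
    100003    4   tcp   2049  nfs
    100021    4   tcp  43089  nlockmgr
"""

# TCP ports exposed by the Service manifest in this post.
SERVICE_TCP_PORTS = {111, 2049, 20048}


def parse_rpcinfo(text: str) -> Set[Tuple[str, str, int]]:
    """Return (service, proto, port) tuples from 'rpcinfo -p' style output."""
    entries = set()
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue
        _program, _version, proto, port, service = fields
        entries.add((service, proto, int(port)))
    return entries


registered = parse_rpcinfo(RPCINFO_OUTPUT)
required = {("portmapper", "tcp", 111), ("mountd", "tcp", 20048), ("nfs", "tcp", 2049)}
print("missing required services:", sorted(required - registered))

# Registrations the Service does not expose: UDP entries and any other port.
not_exposed = sorted(
    e for e in registered if e[1] != "tcp" or e[2] not in SERVICE_TCP_PORTS
)
print("registered but not exposed by the Service:", not_exposed)
```

The second list is a reminder that rpcinfo reports what is registered inside the Pod, which is a superset of what the Load Balancer Service makes available to external clients.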

And for the final test, can we actually mount the exported share from the NFS server Pod to my client VM sitting outside the cluster?

# mount -t nfs 192.168.191.67:/exports /demo
# cd /demo
# touch my-new-file
# ls
index.html  lost+found  my-new-file
#

LGTM. So what is happening here? In my example, my Load Balancer (provided by the NSX-T CNI) has provided an IP address for my Service. As the Service receives NFS client requests on what could be termed a virtual IP address of 192.168.191.67, these requests are being redirected or proxied to our back-end NFS Server Pod on 172.16.5.2. This is all handled by the kube-proxy daemon which we discussed briefly in the failure scenarios post. It takes care of configuring the network on its K8s node so that network requests to the virtual/external IP address are redirected to the back-end Pod(s). In this way, we have managed to expose an internal Kubernetes Pod based application to the outside world. So not only can clients within the cluster access these resources, but so can clients outside of Kubernetes.

Manifests used in this demo (as well as previous 101 blog posts) can be found on my vsphere-storage-101 github repo.

2 Replies to “Kubernetes Storage on vSphere 101 – NFS revisited”

  1. Great article series Cormac! One annoying thing about StatefulSets is that your NFS server wouldn’t recover on underlying ESXi host failure (something you highlight in your failure scenarios article). For more reliable NFS, vSAN Native File Services couldn’t come soon enough! 🙂

    1. Indeed – but this is not just a vSphere thing. It is behaviour seen on multiple platforms, and is something which is being actively worked on by the K8s community, Lukasz.
