Kubernetes Storage on vSphere 101 – NFS revisited
In my most recent 101 post on ReadWriteMany volumes, I shared an example in which we created an NFS server in a Pod that automatically exported a File Share. We then mounted the File Share in multiple NFS client Pods deployed in the same namespace. We saw how multiple Pods were able to write to the same ReadWriteMany volume, which was the purpose of the exercise. I received a few questions on the back of that post relating to the use of Services. In particular, could an external NFS client, even one outside of the K8s cluster, access a volume from an NFS Server running in a Pod?
Therefore, in this post, we will look at how to do just that. We will create a Service that external clients can use to mount a File Share from an NFS Server running in a K8s Pod. To achieve this, we will be looking at a Service type that we haven’t seen before: the LoadBalancer type. With this Service type, our NFS Server will be associated with an external IP address, which should allow our clients to access the NFS exports. If your K8s distribution already has a Container Network Interface (CNI) deployed that can provide these external IP addresses, e.g. NSX-T, then great. If not, I will introduce you to the MetalLB load balancer later in the post, which will provide external IP addresses for your LoadBalancer Services.
Please note that one would typically use a LoadBalancer Service to load balance requests across multiple Pods that are part of a StatefulSet or ReplicaSet. However, we’re not going to delve into that functionality here; we are just using it for external access. As I said in previous posts, I may look at doing a post about Services in more detail at some point.
To begin this demonstration, let’s create the NFS Server Pod. Here is the manifest:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nfs-server-ext
  namespace: nfs
  labels:
    app: nfs-server-ext
spec:
  serviceName: nfs-service-svc-ext
  replicas: 1
  selector:
    matchLabels:
      app: nfs-server-ext
  template:
    metadata:
      labels:
        app: nfs-server-ext
    spec:
      containers:
      - name: nfs-server-ext
        image: gcr.io/google_containers/volume-nfs:0.8
        ports:
        - name: nfs
          containerPort: 2049
        - name: mountd
          containerPort: 20048
        - name: rpcbind
          containerPort: 111
        securityContext:
          privileged: true
        volumeMounts:
        - name: nfs-export
          mountPath: /exports
  volumeClaimTemplates:
  - metadata:
      name: nfs-export
      annotations:
        volume.beta.kubernetes.io/storage-class: nfs-sc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi
I am deploying my NFS Server as a StatefulSet, but with only a single replica. This simply means that should the Pod fail, the StatefulSet will take care of restarting it, etc. You can review StatefulSets here. What this manifest does is instruct K8s to create a 5GB volume from the storage defined in my StorageClass nfs-sc and present it to the NFS server container as /exports. The NFS Server container (running in the Pod) is configured to automatically export the directory /exports as a File Share. Basically, whatever volume we add to the manifest under the name nfs-export is automatically exported, regardless of its size. I have opened 3 container ports for the NFS server: 2049 for nfs, 20048 for mountd and 111 for the portmapper/rpcbind. These are required for NFS to work. Let’s look at the StorageClass next:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nfs-sc
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: thin
  storagePolicyName: raid-1
  datastore: vsanDatastore
This StorageClass references the VCP (vSphere Cloud Provider) storage driver called vsphere-volume. It states that volumes should be instantiated on the vSAN datastore, using the Storage Policy called raid-1. Nothing really new here – we have seen this many times in previous 101 posts. If you need to review StorageClasses, you can do that here.
Now we come to the interesting bit – the Service. It is of type LoadBalancer so that it can be assigned an external IP address. I am fortunate in that I have a lab setup with NSX-T, meaning that NSX-T is configured with a Floating IP Pool to provide me with external IP addresses whenever I need them for LoadBalancer Service types. Access to NSX-T is not always possible, so in those cases, I have used MetalLB in the past. To deploy a MetalLB load balancer, you can use the following command to create the necessary Kubernetes objects and privileges:
$ kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml
namespace/metallb-system created
serviceaccount/controller created
serviceaccount/speaker created
clusterrole.rbac.authorization.k8s.io/metallb-system:controller created
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created
role.rbac.authorization.k8s.io/config-watcher created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created
rolebinding.rbac.authorization.k8s.io/config-watcher created
daemonset.apps/speaker created
deployment.apps/controller created
Once the MetalLB load balancer is deployed, it is simply a matter of creating a ConfigMap with a pool of IP addresses for it to use for any LoadBalancer Service types that require external IP addresses. Here is a sample ConfigMap YAML that I have used in the past, with a range of external IP addresses configured. Modify the IP address range appropriately for your environment. Remember that these IP addresses must be on a network that can be reached by your NFS clients, so that they can mount the exported filesystems from the NFS Server Pod:
$ cat layer2-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: my-ip-space
      protocol: layer2
      addresses:
      - 10.27.51.172-10.27.51.178
$ kubectl apply -f layer2-config.yaml
configmap/config created
Now, when your Service starts with type LoadBalancer, it will be allocated one of the IP addresses from the pool defined in the ConfigMap.
Since my Container Network Interface (CNI) is NSX-T, I don’t need to worry about that. As soon as I specify type: LoadBalancer in my Service manifest file, NSX-T will retrieve an available address from the preconfigured pool of Floating IP addresses, and allocate it to my service. Here is my manifest for the NFS server service, which opens the same network ports as our NFS Server container running in the Pod:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nfs-server-svc-ext
  name: nfs-server-svc-ext
  namespace: nfs
spec:
  ports:
  - name: nfs
    port: 2049
  - name: mountd
    port: 20048
  - name: rpcbind
    port: 111
  selector:
    app: nfs-server-ext
  type: LoadBalancer
Note the selector, and how it matches the label on the Pod. This is how the Service discovers the backing Pods and connects to them.
The next step is to go ahead and deploy the StorageClass and StatefulSet for the NFS Server. We will verify that the PVC, PV and Pod get created accordingly.
$ kubectl create -f nfs-sc.yaml
storageclass.storage.k8s.io/nfs-sc created

$ kubectl create -f nfs-server-sts-ext.yaml
statefulset.apps/nfs-server-ext created

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                             STORAGECLASS   REASON   AGE
pvc-daa8629c-9746-11e9-8893-005056a27deb   5Gi        RWO            Delete           Bound    nfs/nfs-export-nfs-server-ext-0   nfs-sc                  111m

$ kubectl get pvc
NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nfs-export-nfs-server-ext-0   Bound    pvc-daa8629c-9746-11e9-8893-005056a27deb   5Gi        RWO            nfs-sc         112m

$ kubectl get pod
NAME               READY   STATUS    RESTARTS   AGE
nfs-server-ext-0   1/1     Running   0          93s
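Before creating the Service, we can optionally confirm from inside the Pod that the persistent volume is mounted on /exports and that the directory is being exported. This is just a sanity-check sketch; it assumes the volume-nfs image keeps its export configuration in /etc/exports, which may vary between image versions:

$ kubectl exec -n nfs nfs-server-ext-0 -- df -h /exports
$ kubectl exec -n nfs nfs-server-ext-0 -- cat /etc/exports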
With the Pod up and running, let’s deploy our Load Balancer Service and check that the Endpoints get created correctly.
$ kubectl create -f nfs-server-svc-ext-lb.yaml
service/nfs-server-svc-ext created

$ kubectl get svc
NAME                 TYPE           CLUSTER-IP       EXTERNAL-IP                 PORT(S)                                        AGE
nfs-server-svc-ext   LoadBalancer   10.100.200.235   100.64.0.1,192.168.191.67   2049:32126/TCP,20048:31504/TCP,111:30186/TCP   5s

$ kubectl get endpoints
NAME                 ENDPOINTS                                         AGE
nfs-server-svc-ext   172.16.5.2:20048,172.16.5.2:111,172.16.5.2:2049   9s

$ kubectl describe endpoints nfs-server-svc-ext
Name:         nfs-server-svc-ext
Namespace:    nfs
Labels:       app=nfs-server-svc-ext
Annotations:  <none>
Subsets:
  Addresses:          172.16.5.2
  NotReadyAddresses:  <none>
  Ports:
    Name     Port   Protocol
    ----     ----   --------
    mountd   20048  TCP
    rpcbind  111    TCP
    nfs      2049   TCP

Events:  <none>
All looks good. The external IP address in the output above is 192.168.191.67. This address, allocated by NSX-T from my floating IP address pool, can be reached from other apps running in my environment, so long as they can route to it. Not only do I get an IP address, but I also get a DNS entry for my Service (which is the same name as the Service). Requests to the IP address are also load balanced across all of the back-end Pods that implement the Service (although I only have 1 back-end Pod, so that is not really relevant here).
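To illustrate the DNS entry, here is a minimal in-cluster check. This is only a sketch: it assumes cluster DNS is working and that the busybox image is available, and the Pod name dns-test is purely for illustration:

$ kubectl run dns-test -n nfs --rm -it --restart=Never --image=busybox -- nslookup nfs-server-svc-ext

The short name resolves because the lookup runs in the same nfs namespace; external clients would of course use the external IP address instead.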
It is also useful to query the Endpoints, since these will only populate once the Pod has bound successfully to the Service. It confirms that your labels and selector are working correctly between Pod and Service. It also displays the IP address of the NFS Server Pod, and any configured ports.
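If the Endpoints ever stay empty, a quick way to check the selector side of things is to list Pods using the same label that the Service selects on (using the nfs namespace from this post):

$ kubectl get pods -n nfs -l app=nfs-server-ext

If this returns no Pods, the Service has nothing to forward traffic to, and the label or selector needs fixing.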
With the Load Balancer Service now created, any Pod, virtual machine or bare-metal server running NFS client software should be able to mount the share exported by my NFS Server Pod.
Now, if you had used the MetalLB load balancer, you would expect the external IP address allocated to the LoadBalancer Service to come from the range of IP addresses placed in the MetalLB ConfigMap.
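Whichever load balancer provides the address, you can pull just the external IP out of the Service status with a jsonpath query – a small sketch against the Service created above, handy if you want to script the client mount step:

$ kubectl get svc nfs-server-svc-ext -n nfs -o jsonpath='{.status.loadBalancer.ingress[0].ip}'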
You might ask why I don’t just scale out the StatefulSet to 3 replicas or so and let the requests load balance across them. The thing to keep in mind is that this NFS Server has no built-in replication, and each Pod is using its own unique Persistent Volume (PV). So let’s say my client keeps connecting to Pod-0 and writes lots of data to Pod-0’s PV. Now Pod-0 fails and I am redirected/proxied to Pod-1. Pod-1’s PV will have none of the data that I wrote to Pod-0’s PV – it will be empty, since there is no replication built into the NFS Server Pod. Note that Kubernetes does not do any replication of data in a ReplicaSet or a StatefulSet – it is up to the application running in the Pods to do this.
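To make that concrete, here is a hypothetical sketch (not something you would actually want to do with this NFS server). Each new replica in a StatefulSet gets its own PVC from the volumeClaimTemplates, named after the claim template, the StatefulSet and the Pod ordinal:

$ kubectl scale statefulset nfs-server-ext -n nfs --replicas=3
$ kubectl get pvc -n nfs

You would end up with nfs-export-nfs-server-ext-0, nfs-export-nfs-server-ext-1 and nfs-export-nfs-server-ext-2, each bound to a completely separate PV, with no data shared between them.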
Note: I usually use a showmount -e command pointed at the NFS server to see what shares/directories it is exporting. From my Ubuntu client VM below, you can see that it is not working for the NFS Server Pod, but if I point it at another NFS server IP address (in a VM), it works. I’m unsure why it is not working for my NFS Server Pod.
# showmount -e 192.50.0.4
Export list for 192.50.0.4:
/nfs *

# showmount -e 192.168.191.67
clnt_create: RPC: Port mapper failure - Unable to receive: errno 111 (Connection refused)
#
However, the rpcinfo -p command works just fine when pointed at the NFS Server Pod.
# rpcinfo -p 192.168.191.67
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100003    3   udp   2049  nfs
    100227    3   udp   2049  nfs_acl
    100021    1   udp  59128  nlockmgr
    100021    3   udp  59128  nlockmgr
    100021    4   udp  59128  nlockmgr
    100021    1   tcp  43089  nlockmgr
    100021    3   tcp  43089  nlockmgr
    100021    4   tcp  43089  nlockmgr
    100024    1   udp  35175  status
    100024    1   tcp  46455  status
And for the final test, can we actually mount the exported share from the NFS server Pod to my client VM sitting outside the cluster?
# mount -t nfs 192.168.191.67:/exports /demo
# cd /demo
# touch my-new-file
# ls
index.html  lost+found  my-new-file
#
LGTM. So what is happening here? In my example, my load balancer (provided by the NSX-T CNI) has provided an IP address for my Service. As the Service receives NFS client requests on what could be termed a virtual IP address of 192.168.191.67, these requests are redirected or proxied to our back-end NFS Server Pod on 172.16.5.2. This is all handled by the kube-proxy daemon, which we discussed briefly in the failure scenarios post. It takes care of configuring the network on its K8s node so that requests to the virtual/external IP address are redirected to the back-end Pod(s). In this way, we have managed to expose an internal, Pod-based Kubernetes application to the outside world. So not only can clients within the cluster access these resources, but so can clients outside of Kubernetes.
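If you want to see this proxying in action, and assuming kube-proxy is running in its default iptables mode, you can inspect the NAT rules that kube-proxy programs on a K8s node. A sketch, with the output omitted:

# iptables -t nat -L KUBE-SERVICES -n | grep 192.168.191.67

You should see rules matching the external IP that jump to the chains kube-proxy created for the nfs-server-svc-ext Service, which ultimately DNAT the traffic to the Pod IP, 172.16.5.2.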
Manifests used in this demo (as well as previous 101 blog posts) can be found on my vsphere-storage-101 github repo.
Great article series, Cormac! One annoying thing about StatefulSets is that your NFS server wouldn’t recover from an underlying ESXi host failure (something you highlight in your failure scenarios article). For more reliable NFS, vSAN Native File Services couldn’t come soon enough! 🙂
Indeed – but this is not just a vSphere thing. It is behaviour seen on multiple platforms, and is something which is being actively worked on by the K8s community, Lukasz.