Network Policies in Tanzu Mission Control revisited
Earlier this month, I had my first look at network policies in Tanzu Mission Control (TMC). That earlier post looked at a very simple network policy where I used a web server app, and showed how we could control access to it from other pods by using labels. In this post, I want to do something a bit more detailed. For the purposes of this test, I will use a pod-based NFS server, and then control access to it from other pods that wish to mount the NFS file share from the server pod. I have already created the workspace in TMC, and attached the namespace where the NFS server and client pods are to be deployed. All of the details on how to do this, including the creation of a policy, are in the previous post, so please refer to that post for those instructions. Here, we will focus on how a network policy can control the communication between the NFS client and NFS server pods.
The YAML manifests used for the NFS server and client pods are available here, along with the manifests for creating the server service and the client PVC and PV. I initially deployed this app in the namespace nfs-testing without any policy configured, and verified that I could successfully mount the NFS share exported from the NFS server pod on all of the NFS client pods. Other communication between the NFS server and client pods, such as ping requests, also worked. I then deleted the NFS client pods and left just the NFS server pod in place. Note the endpoints in the output that follows. This is where requests to the service via its ClusterIP (100.69.133.143) are routed by kube-proxy. Thus, when we request an NFS share to be mounted from the service IP address, the request is routed to the NFS server pod on IP address 100.96.1.101.
% kubectl get pods -n nfs-testing -o wide -L app
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES APP
nfs-server-0 1/1 Running 0 157m 100.96.1.214 workload2-md-0-859f9d5496-zjlzj.corinternal.com <none> <none> nfssrvr

% kubectl get svc -n nfs-testing
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nfs-server ClusterIP 100.69.133.143 <none> 2049/TCP,20048/TCP,111/TCP 12m

% kubectl get endpoints -n nfs-testing
NAME ENDPOINTS AGE
nfs-server 100.96.1.101:20048,100.96.1.101:111,100.96.1.101:2049 13m
Create a custom ingress policy in TMC
As mentioned, I wish to control which pods can access the NFS server pod. To do that, I will create a custom ingress policy. So that this policy applies only to my NFS server pod, I will add a pod selector that matches its label (app=nfssrvr). The final step is to add a rule describing who may reach the ingress of the NFS server pod, i.e. the NFS client pods. I only want NFS client pods to have access, which in practice means pods in the 100.96.0.0/16 address range, the CIDR shared by both the NFS client and server pods. I will now log in to TMC and create the rule with those requirements.
After creating the policy in TMC, it becomes visible in the nfs-testing namespace. Note the PodSelector and the CIDR which match what was added to the policy above.
% kubectl describe networkpolicy -n nfs-testing
Name:         tmc.wsp.nfs-testing.control-nfs-ingress
Namespace:    nfs-testing
Created on:   2021-11-16 15:14:30 +0000 GMT
Labels:       tmc.cloud.vmware.com/managed=true
Annotations:  <none>
Spec:
  PodSelector:     app=nfssrvr
  Allowing ingress traffic:
    To Port: <any> (traffic allowed to all ports)
    From:
      IPBlock:
        CIDR: 100.96.0.0/16
        Except:
  Not affecting egress traffic
  Policy Types: Ingress
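For readers who would rather see the object than the describe output, the policy above corresponds roughly to a standard Kubernetes NetworkPolicy (networking.k8s.io/v1). The sketch below is assembled from the describe output only; it is not the exact object that TMC generates, and the helper name build_ingress_policy is made up for illustration. It builds the manifest as a Python dictionary and prints it as JSON, which kubectl accepts as well as YAML.

```python
import json


def build_ingress_policy(name, namespace, pod_labels, cidr,
                         except_cidrs=None, tcp_ports=None):
    """Build a networking.k8s.io/v1 NetworkPolicy manifest as a plain dict.

    Hypothetical helper for illustration; TMC creates the real object.
    """
    ip_block = {"cidr": cidr}
    if except_cidrs:
        ip_block["except"] = list(except_cidrs)

    rule = {"from": [{"ipBlock": ip_block}]}
    if tcp_ports:
        # Leaving "ports" out of the rule means all ports are allowed.
        rule["ports"] = [{"protocol": "TCP", "port": p} for p in tcp_ports]

    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": dict(pod_labels)},
            "policyTypes": ["Ingress"],
            "ingress": [rule],
        },
    }


policy = build_ingress_policy(
    name="tmc.wsp.nfs-testing.control-nfs-ingress",
    namespace="nfs-testing",
    pod_labels={"app": "nfssrvr"},
    cidr="100.96.0.0/16",
)
print(json.dumps(policy, indent=2))
```

The optional except_cidrs and tcp_ports arguments map onto the "Except" and "To Port" fields of the describe output, so the same helper can describe a policy with an excluded range or a restricted port list.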
This policy should continue to allow the NFS clients to successfully start up, since this policy allows pod to pod communication between the NFS client and NFS server.
% kubectl get pods -o wide -n nfs-testing -L app -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES APP
nfs-client-pod-1 0/1 ContainerCreating 0 3s workload2-md-0-859f9d5496-djgsr.corinternal.com nfs1
nfs-client-pod-2 0/1 ContainerCreating 0 2s workload2-md-0-859f9d5496-zjlzj.corinternal.com nfs2
nfs-client-pod-3 0/1 ContainerCreating 0 2s workload2-md-0-859f9d5496-djgsr.corinternal.com nfs3
nfs-server-0 1/1 Running 0 60m 100.96.1.214 workload2-md-0-859f9d5496-zjlzj.corinternal.com nfssrvr
nfs-client-pod-1 1/1 Running 0 5s 100.96.2.193 workload2-md-0-859f9d5496-djgsr.corinternal.com nfs1
nfs-client-pod-2 1/1 Running 0 5s 100.96.1.17 workload2-md-0-859f9d5496-zjlzj.corinternal.com nfs2
nfs-client-pod-3 1/1 Running 0 5s 100.96.2.194 workload2-md-0-859f9d5496-djgsr.corinternal.com nfs3
The NFS client pods appear to have started successfully. Let’s see if they have successfully mounted the NFS share (onto the /nfs directory mount-point in the pod).
% kubectl exec -it nfs-client-pod-1 -n nfs-testing -- sh
/ # df /nfs
Filesystem 1K-blocks Used Available Use% Mounted on
100.69.133.143:/exports 5095424 20480 4796416 0% /nfs
/ # cd /nfs
/nfs # echo "hello-world" > hello
/nfs # cat hello
hello-world
/nfs #
Looks like everything is working as expected, and the NFS clients are still able to mount the NFS share from the NFS server. Let’s do some more tests on the policy.
Add an Exclude IP range
From the pod listing in the previous step, we can see that the NFS client pods came up on two different ranges: 100.96.1.0/24 and 100.96.2.0/24. As a test, let’s delete the client pods, add an Exclude IP range to the rule, and set it to 100.96.2.0/24. Now let’s see what happens when we try to deploy the NFS client pods once more.
Here is the updated policy with the excluded IP address range added.
Let’s examine the policy from the TKG cluster perspective.
% kubectl describe networkpolicy -n nfs-testing
Name:         tmc.wsp.nfs-testing.control-nfs-ingress
Namespace:    nfs-testing
Created on:   2021-11-16 15:14:30 +0000 GMT
Labels:       tmc.cloud.vmware.com/managed=true
Annotations:  <none>
Spec:
  PodSelector:     app=nfssrvr
  Allowing ingress traffic:
    To Port: <any> (traffic allowed to all ports)
    From:
      IPBlock:
        CIDR: 100.96.0.0/16
        Except: 100.96.2.0/24
  Not affecting egress traffic
  Policy Types: Ingress
Now when the NFS client pods are deployed, we can see that only the NFS client pod on the 100.96.1.0/24 network comes online, and the two NFS client pods on the 100.96.2.0/24 network are stuck in ContainerCreating. They are stuck because they can no longer reach the NFS server pod to mount the share, since their IP range is excluded in the rule.
% kubectl apply -f nfs-client-pod-1.yaml -f nfs-client-pod-2.yaml -f nfs-client-pod-3.yaml
pod/nfs-client-pod-1 created
pod/nfs-client-pod-2 created
pod/nfs-client-pod-3 created

% kubectl get pods -n nfs-testing -o wide -L app
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES APP
nfs-client-pod-1 0/1 ContainerCreating 0 16s workload2-md-0-859f9d5496-djgsr.corinternal.com nfs1
nfs-client-pod-2 1/1 Running 0 16s 100.96.1.127 workload2-md-0-859f9d5496-zjlzj.corinternal.com nfs2
nfs-client-pod-3 0/1 ContainerCreating 0 15s workload2-md-0-859f9d5496-djgsr.corinternal.com nfs3
nfs-server-0 1/1 Running 0 174m 100.96.1.214 workload2-md-0-859f9d5496-zjlzj.corinternal.com nfssrvr
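To make the CIDR and Except logic concrete, here is a minimal Python sketch of how an ipBlock rule with an excluded range decides whether a source address is admitted. It only models the address check, not how the cluster's CNI actually enforces it, and the function name source_allowed is my own. The client IP addresses are the ones from the first pod listing in this post, used purely as sample inputs, since pods get new addresses each time they are recreated.

```python
import ipaddress


def source_allowed(source_ip, cidr, except_cidrs=()):
    """Return True if source_ip is inside cidr but outside every excepted range."""
    ip = ipaddress.ip_address(source_ip)
    if ip not in ipaddress.ip_network(cidr):
        return False
    return not any(ip in ipaddress.ip_network(ex) for ex in except_cidrs)


# Sample client pod IPs, taken from the earlier pod listing in this post.
clients = {
    "nfs-client-pod-1": "100.96.2.193",
    "nfs-client-pod-2": "100.96.1.17",
    "nfs-client-pod-3": "100.96.2.194",
}

results = {
    name: source_allowed(ip, "100.96.0.0/16", ["100.96.2.0/24"])
    for name, ip in clients.items()
}
for name, allowed in results.items():
    print(name, "allowed" if allowed else "blocked")
```

With those sample addresses, only the pod on the 100.96.1.0/24 range passes the check, which matches the behaviour seen in the listing above.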
After approximately 2 minutes, the NFS mount attempt should time out, and the following events will be observable on the pod.
% kubectl get events -A | grep nfs-client-pod-3
nfs-testing 2m14s Normal Scheduled pod/nfs-client-pod-3 \
  Successfully assigned nfs-testing/nfs-client-pod-3 to workload2-md-0-859f9d5496-zjlzj.corinternal.com
nfs-testing 11s Warning FailedMount pod/nfs-client-pod-3 \
  Unable to attach or mount volumes: unmounted volumes=[nfs-vol], unattached volumes=[default-token-xpmgh nfs-vol]: timed out waiting for the condition
We can go back and modify the policy so that it allows pods on the 100.96.2.0/24 network to access the ingress of the NFS server pod. Simply changing the exclude range from 100.96.2.0/24 to 100.96.3.0/24 will achieve this.
% kubectl describe networkpolicy -n nfs-testing
Name:         tmc.wsp.nfs-testing.control-nfs-ingress
Namespace:    nfs-testing
Created on:   2021-11-16 15:14:30 +0000 GMT
Labels:       tmc.cloud.vmware.com/managed=true
Annotations:  <none>
Spec:
  PodSelector:     app=nfssrvr
  Allowing ingress traffic:
    To Port: <any> (traffic allowed to all ports)
    From:
      IPBlock:
        CIDR: 100.96.0.0/16
        Except: 100.96.3.0/24
  Not affecting egress traffic
  Policy Types: Ingress

% kubectl get pods -o wide -n nfs-testing -L app
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES APP
nfs-client-pod-1 1/1 Running 0 3m45s 100.96.2.197 workload2-md-0-859f9d5496-djgsr.corinternal.com <none> <none> nfs1
nfs-client-pod-2 1/1 Running 0 3m44s 100.96.2.198 workload2-md-0-859f9d5496-djgsr.corinternal.com <none> <none> nfs2
nfs-client-pod-3 1/1 Running 0 3m43s 100.96.1.24 workload2-md-0-859f9d5496-zjlzj.corinternal.com <none> <none> nfs3
nfs-server-0 1/1 Running 0 68m 100.96.1.214 workload2-md-0-859f9d5496-zjlzj.corinternal.com <none> <none> nfssrvr
Add Port Rules
With the present policy, the NFS client pods can reach not just the NFS ports but every port on the NFS server pod, because without any port rules specified, all traffic from the allowed sources is accepted. Now we will add a rule that controls which ports are accessible on the ingress of the NFS server pod. A simple way to show how open things currently are is ping, which works from the client to the server (and vice versa).
% kubectl exec -it nfs-client-pod-1 -n nfs-testing -- sh
/ # ping 100.96.1.214
PING 100.96.1.214 (100.96.1.214): 56 data bytes
64 bytes from 100.96.1.214: seq=0 ttl=62 time=0.982 ms
64 bytes from 100.96.1.214: seq=1 ttl=62 time=0.607 ms
64 bytes from 100.96.1.214: seq=2 ttl=62 time=0.271 ms
^C
--- 100.96.1.214 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.271/0.620/0.982 ms
/ # exit

% kubectl exec -it nfs-server-0 -n nfs-testing -- sh
sh-4.2# ping 100.96.2.218
PING 100.96.2.218 (100.96.2.218) 56(84) bytes of data.
64 bytes from 100.96.2.218: icmp_seq=1 ttl=62 time=0.962 ms
64 bytes from 100.96.2.218: icmp_seq=2 ttl=62 time=0.417 ms
64 bytes from 100.96.2.218: icmp_seq=3 ttl=62 time=0.278 ms
^C
--- 100.96.2.218 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2014ms
rtt min/avg/max/mdev = 0.278/0.552/0.962/0.295 ms
sh-4.2#
Let’s now go ahead and only allow ingress on the NFS-specific ports over TCP, namely ports 111, 2049 and 20048. The ports required for NFS are defined in the NFS server service, and can be queried as follows.
% kubectl get svc nfs-server -n nfs-testing
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nfs-server ClusterIP 100.69.133.143 <none> 2049:30331/TCP,20048:31149/TCP,111:31083/TCP 3h7m
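If you want to derive the port rules from the service rather than typing them by hand, the PORT(S) column is easy to parse. The following is a small sketch, with a function name of my own choosing, that turns that column into (port, protocol) pairs. It assumes the service ports are the same as the pod ports, which is the case here, since the endpoints listed earlier use 2049, 20048 and 111 as well.

```python
def parse_service_ports(ports_column):
    """Parse the PORT(S) column of `kubectl get svc` into (port, protocol) pairs.

    Handles both "2049/TCP" and "2049:30331/TCP" style entries; in the second
    form only the part before the colon (the service port) is kept.
    """
    parsed = []
    for entry in ports_column.split(","):
        port_part, protocol = entry.strip().split("/")
        service_port = int(port_part.split(":")[0])
        parsed.append((service_port, protocol))
    return parsed


ports = parse_service_ports("2049:30331/TCP,20048:31149/TCP,111:31083/TCP")
print(sorted(ports))
```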
By way of testing, let’s add access to only one of those ports initially. Let’s use port 111.
Let’s check the policy in the namespace:
% kubectl describe networkpolicy -n nfs-testing
Name:         tmc.wsp.nfs-testing.control-nfs-ingress
Namespace:    nfs-testing
Created on:   2021-11-16 15:14:30 +0000 GMT
Labels:       tmc.cloud.vmware.com/managed=true
Annotations:  <none>
Spec:
  PodSelector:     app=nfssrvr
  Allowing ingress traffic:
    To Port: 111/TCP
    From:
      IPBlock:
        CIDR: 100.96.0.0/16
        Except: 100.96.3.0/24
  Not affecting egress traffic
  Policy Types: Ingress
With this policy in place, all of the NFS client pods get stuck in the ContainerCreating state, as they are unable to mount the NFS file share from the NFS server. We need to include all 3 of the ports in the rules for it to work.
% kubectl describe networkpolicy -n nfs-testing
Name:         tmc.wsp.nfs-testing.control-nfs-ingress
Namespace:    nfs-testing
Created on:   2021-11-16 15:14:30 +0000 GMT
Labels:       tmc.cloud.vmware.com/managed=true
Annotations:  <none>
Spec:
  PodSelector:     app=nfssrvr
  Allowing ingress traffic:
    To Port: 111/TCP
    To Port: 2049/TCP
    To Port: 20048/TCP
    From:
      IPBlock:
        CIDR: 100.96.0.0/16
        Except: 100.96.3.0/24
  Not affecting egress traffic
  Policy Types: Ingress
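A quick way to reason about why port 111 on its own was not enough is to compare the ports a rule allows with the ports the NFS service exposes. The sketch below does just that. The NFS_REQUIRED set comes from the service definition shown in this post, while the function name missing_ports is hypothetical.

```python
# The three TCP ports exposed by the nfs-server service in this post.
NFS_REQUIRED = {("TCP", 111), ("TCP", 2049), ("TCP", 20048)}


def missing_ports(allowed_rules, required=NFS_REQUIRED):
    """Return the required (protocol, port) pairs that the ingress rule omits.

    An empty rule list models a policy with no port section, which allows
    all ports, so nothing is missing in that case.
    """
    if not allowed_rules:
        return set()
    return set(required) - set(allowed_rules)


only_111 = [("TCP", 111)]
all_three = [("TCP", 111), ("TCP", 2049), ("TCP", 20048)]

print(sorted(missing_ports(only_111)))
print(sorted(missing_ports(all_three)))
```

For the rule that only lists port 111, the sketch reports 2049 and 20048 as missing, and the client mount does not succeed until those are added to the policy as well.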
And if we delete and recreate the NFS client pods with the port rules in place, we should hopefully see the pods start successfully.
% kubectl get pods -o wide -n nfs-testing -L app -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES APP
nfs-client-pod-1 1/1 Running 0 2m25s 100.96.2.200 workload2-md-0-859f9d5496-djgsr.corinternal.com <none> <none> nfs1
nfs-client-pod-2 1/1 Running 0 2m24s 100.96.2.201 workload2-md-0-859f9d5496-djgsr.corinternal.com <none> <none> nfs2
nfs-client-pod-3 1/1 Running 0 2m23s 100.96.1.31 workload2-md-0-859f9d5496-zjlzj.corinternal.com <none> <none> nfs3
nfs-server-0 1/1 Running 0 75m 100.96.1.214 workload2-md-0-859f9d5496-zjlzj.corinternal.com <none> <none> nfssrvr
Success! All the pods have come online. This should mean that they are successfully able to mount the NFS share from the NFS server. Let’s log in to one of the clients and make sure.
% kubectl exec -it nfs-client-pod-1 -n nfs-testing -- sh
/ # df /nfs
Filesystem 1K-blocks Used Available Use% Mounted on
100.69.133.143:/exports 5095424 20480 4796416 0% /nfs
/ # ls /nfs
hello lost+found
/ # cat /nfs/hello
hello-world
/ # echo "hello-again" >> /nfs/hello-again
/ # cat /nfs/hello-again
hello-again
/ #
It looks like everything is working as expected. Now the NFS server should only be allowing ingress on the NFS ports, and nothing else. Let’s try a quick ping from a client to the server to verify.
/ # ping 100.96.1.214
PING 100.96.1.214 (100.96.1.214): 56 data bytes
^C
--- 100.96.1.214 ping statistics ---
19 packets transmitted, 0 packets received, 100% packet loss
Pings from the NFS client to the NFS server are now blocked. However, the ingress network policy has only been applied to the NFS server, not to the NFS clients, since we used the PodSelector in the policy to specify that. Thus, it should still be possible to ping from the server to any of the clients. Let’s try:
% kubectl exec -it nfs-server-0 -n nfs-testing -- sh
sh-4.2# ping 100.96.2.220
PING 100.96.2.220 (100.96.2.220) 56(84) bytes of data.
64 bytes from 100.96.2.220: icmp_seq=1 ttl=62 time=0.924 ms
64 bytes from 100.96.2.220: icmp_seq=2 ttl=62 time=0.384 ms
64 bytes from 100.96.2.220: icmp_seq=3 ttl=62 time=0.271 ms
64 bytes from 100.96.2.220: icmp_seq=4 ttl=62 time=0.243 ms
^C
--- 100.96.2.220 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3054ms
rtt min/avg/max/mdev = 0.243/0.455/0.924/0.276 ms
sh-4.2#
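The asymmetry between the two ping tests comes down to which pod is the destination. An ingress policy is only evaluated for traffic arriving at pods that its PodSelector matches. Here is a rough Python sketch of that selection step, with a function name of my own; the labels are the ones shown in the APP column of the pod listings. It does not model the CIDR or port checks, only whether the policy is consulted at all.

```python
def ingress_policy_applies(policy_selector, destination_labels):
    """An ingress policy only restricts pods whose labels match its selector.

    An empty selector matches every pod, which is also how an empty
    podSelector behaves in a Kubernetes NetworkPolicy.
    """
    return all(destination_labels.get(k) == v for k, v in policy_selector.items())


selector = {"app": "nfssrvr"}
server_labels = {"app": "nfssrvr"}
client_labels = {"app": "nfs1"}

# Client -> server: the destination is the server pod, so the policy is evaluated.
client_to_server_checked = ingress_policy_applies(selector, server_labels)
# Server -> client: the destination is a client pod, which the policy does not select.
server_to_client_checked = ingress_policy_applies(selector, client_labels)

print(client_to_server_checked, server_to_client_checked)
```

Since no policy selects the client pods, the ping from the server reaches them unfiltered, and the replies make it back to the server, as the output above shows.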
Yes – that still works since it is not blocked by any network rules. Everything is working as expected.
Tips
Note: When creating rules, TMC won’t report directly that a rule failed to be created. All that will happen is that the rule will not appear in the namespace as expected. If this happens, click on the Policies > Insights view, and check the network for issues. If there is a problem with the policy, it will be reported there.