A closer look at Antrea, the new CNI for vSphere with Tanzu guest clusters
I’ve spent quite a bit of time highlighting many of the new features of vSphere with Tanzu in earlier blog posts. In those posts, we saw how vSphere with Tanzu can be used to provision Tanzu Kubernetes Grid (TKG) guest clusters to provide a native, upstream-like, VMware-supported Kubernetes. In this post, I want to delve into the guest cluster in more detail and examine the new default Container Network Interface (CNI), called Antrea, that now ships with the TKG guest cluster.
Antrea provides networking and security services for a Kubernetes cluster. It is based on the Open vSwitch (OVS) project, which is used as the networking data plane to provide network overlays for Pod-to-Pod communication. Another interesting feature of Antrea is network security policies. While Kubernetes has built-in network policies, Antrea builds on those native policies to provide more fine-grained network policies of its own. It has a ClusterNetworkPolicy which, as the name implies, operates at the Kubernetes cluster level. It also has a NetworkPolicy which limits the scope of a policy to a Kubernetes Namespace. The ClusterNetworkPolicy can be thought of as a means for a Kubernetes cluster admin to create a security policy for the cluster as a whole, while the NetworkPolicy can be thought of as a means for a developer to secure applications in a particular Namespace. We will later take a look at how these network policies can be used to secure traffic between Pods in the same workload, and between workloads in the same cluster.
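To give a feel for the difference in scope, here is a minimal sketch of what an Antrea-native ClusterNetworkPolicy looks like. This is for illustration only — the metadata name, priority and label values are invented, and the API group shown follows the upstream Antrea CRDs at the time of writing (remember these Antrea-native policies are not yet available in TKG guest clusters):

```yaml
# Hypothetical example of an Antrea-native, cluster-scoped policy.
# Unlike a namespaced NetworkPolicy, it applies across all Namespaces
# and supports an explicit action (Allow/Drop) plus rule ordering.
apiVersion: security.antrea.tanzu.vmware.com/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: cluster-drop-to-db        # illustrative name
spec:
  priority: 5                     # lower number = higher precedence
  appliedTo:
    - podSelector:
        matchLabels:
          app: cassandra          # applies to matching Pods in any Namespace
  ingress:
    - action: Drop                # Antrea policies carry an explicit action
      from:
        - podSelector:
            matchLabels:
              role: untrusted
```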
Let’s start with a look at my TKG guest cluster, which I already provisioned from vSphere with Tanzu. It is running Kubernetes v1.18.5, and has been deployed with a single control plane node and two worker nodes. I’ve also displayed the Pods related to Antrea.
$ kubectl get nodes
NAME                                                STATUS   ROLES    AGE   VERSION
tkg-cluster-1-18-5-control-plane-bsn9l              Ready    master   21d   v1.18.5+vmware.1
tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   Ready    <none>   21d   v1.18.5+vmware.1
tkg-cluster-1-18-5-workers-8zj27-67c98696dc-kfd5z   Ready    <none>   21d   v1.18.5+vmware.1

$ kubectl get pods -A | grep antrea
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   antrea-agent-ktn22                   2/2     Running   8          21d
kube-system   antrea-agent-l7gzv                   2/2     Running   9          21d
kube-system   antrea-agent-mt8xt                   2/2     Running   11         21d
kube-system   antrea-controller-846bd89cc6-87bjk   1/1     Running   5          21d
As you can see, in this TKG cluster with a single control plane node and two worker nodes, there is a single Antrea controller and three Antrea agents (one per node). The Antrea controller watches the Kubernetes API server and, if there are any requests or updates around network policy, or Pod and Namespace networking, distributes the required policies to all Antrea agents.
In my TKG cluster, I have deployed a Cassandra StatefulSet and an nginx Deployment. The ClusterIPs have been provided by Kubernetes (the HA-Proxy load balancer deployed with vSphere with Tanzu provides the External IP for the nginx web server).
$ kubectl get svc -A
NAMESPACE     NAME         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
cassandra     cassandra    ClusterIP      10.100.48.126    <none>        9042/TCP                     20d
default       kubernetes   ClusterIP      10.96.0.1        <none>        443/TCP                      20d
default       nginx-svc    LoadBalancer   10.108.137.209   10.27.62.18   443:30472/TCP,80:30915/TCP   20d
default       supervisor   ClusterIP      None             <none>        6443/TCP                     20d
kube-system   antrea       ClusterIP      10.96.120.159    <none>        443/TCP                      20d
kube-system   kube-dns     ClusterIP      10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP       20d
$
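The manifests behind these workloads are not shown in this post, but a minimal sketch of the nginx Deployment and its LoadBalancer Service might look like the following (the image, replica count and port layout here are assumptions, not taken from the cluster above — only the names and labels match what appears later in the post):

```yaml
# Hypothetical sketch of the nginx workload used in this post.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx               # the label the NetworkPolicy later matches on
    spec:
      containers:
        - name: nginx
          image: nginx:1.19      # assumed image/tag
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  type: LoadBalancer             # External IP supplied by the HA-Proxy load balancer
  selector:
    app: nginx
  ports:
    - name: http
      port: 80
    - name: https
      port: 443
```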
With this information, let’s take a closer look at the containers that make up one of the Antrea agents. I’ll use my old trick of requesting logs from a Pod, which fails, but at the same time informs me of the containers that make up the Pod. As you can see below, there is an antrea-agent and an antrea-ovs container in the Pod. To begin, let’s take a look at the antrea-ovs container, where a number of OVS-related commands are available. The Open vSwitch documentation provides more information about these commands.
$ kubectl logs antrea-agent-ktn22 -n kube-system
error: a container name must be specified for pod antrea-agent-ktn22, \
choose one of: [antrea-agent antrea-ovs] or one of the init containers: [install-cni]

$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-ovs -- bash
root [ / ]# ovs-
ovs-appctl            ovs-dpctl             ovs-ofctl             ovs-pki               ovs-test              ovs-vsctl
ovs-bugtool           ovs-dpctl-top         ovs-parse-backtrace   ovs-tcpdump           ovs-testcontroller    ovs-vswitchd
ovs-docker            ovs-l3ping            ovs-pcap              ovs-tcpundump         ovs-vlan-test
root [ / ]# ovs-vsctl show
83ed275e-084b-4ebb-a19e-f150646bf62b
    Bridge br-int
        datapath_type: system
        Port antrea-gw0
            Interface antrea-gw0
                type: internal
        Port vsphere--940a85
            Interface vsphere--940a85
        Port nginx-de-548af9
            Interface nginx-de-548af9
        Port antrea-tun0
            Interface antrea-tun0
                type: geneve
                options: {csum="true", key=flow, remote_ip=flow}
        Port cassandr-1e6e0a
            Interface cassandr-1e6e0a
        Port nginx-de-481c4a
            Interface nginx-de-481c4a
    ovs_version: "2.13.1"
The ovs-vsctl command allows us to interact with the Open vSwitch (OVS) configuration database; the output above is a brief overview of the database contents. Of note in the output is antrea-tun0. When packets from one of the ‘local’ Pods on this node are sent to a Pod on another node, they are forwarded to the antrea-tun0 port on the ‘local’ OVS. The packets are then sent to the destination node through the tunnel (implemented using Geneve above). On the destination node, the packets are received via the corresponding antrea-tun0 port and forwarded on to the destination Pod.
Also interesting is how packets are routed externally. Packets sent to an external IP on the node’s network are forwarded to the antrea-gw0 port, routed to the appropriate network interface of the Kubernetes node and sent out to the network from there. The Antrea Kubernetes Networking whitepaper has great detail on all of these inner workings – check it out if you want to learn more.
Let’s turn our attention to the other Antrea container next – antrea-agent. This has another really useful utility, antctl. In the output below, I am using it to examine the Pods on this node/agent that are using the OVS.
$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-agent -- bash
root [ / ]#
root [ / ]# antctl get
Get the status or resource of a topic

Usage:
  antctl get [command]

Available Commands:
  addressgroup   Print address groups
  agentinfo      Print agent's basic information
  appliedtogroup Print appliedto groups
  networkpolicy  Print NetworkPolicies
  ovsflows       Dump OVS flows
  podinterface   Print Pod's network interface information

Flags:
  -h, --help   help for get

Global Flags:
  -k, --kubeconfig string   absolute path to the kubeconfig file
  -s, --server string       address and port of the API server, taking precedence over the default endpoint and the one set in kubeconfig
  -t, --timeout duration    time limit of the execution of the command
  -v, --verbose             enable verbose output

Use "antctl get [command] --help" for more information about a command.
root [ / ]#
root [ / ]# antctl get podinterface
NAMESPACE          NAME                               INTERFACE-NAME    IP            MAC                 PORT-UUID                              OF-PORT   CONTAINER-ID
cassandra          cassandra-1                        cassandr-1e6e0a   192.168.2.6   4e:c8:62:fb:8b:75   9fd9bdc3-5fd1-4c4a-b46a-d4e35af0f8da   7         ac559b874be
default            nginx-deployment-cc7df4f8f-69cm2   nginx-de-481c4a   192.168.2.2   32:a1:45:43:7c:96   bf184406-a53c-4b3d-84c7-a979f435e162   3         f3d429a49c6
default            nginx-deployment-cc7df4f8f-pcn79   nginx-de-548af9   192.168.2.4   f2:38:81:3e:e0:08   54b7032b-f513-4a2b-9bc8-924b2dcb620f   4         52c08818eeb
vmware-system-csi  vsphere-csi-node-86fds             vsphere--940a85   192.168.2.3   d2:95:fc:3a:1e:a1   352c776b-1bcc-41d9-a961-50039e804689   5         08a976b1c76
As you can see, there is lots of good information here, such as each Pod’s IP address and MAC address. We can also use this command to display OVS flows, which are used to implement the NetworkPolicies for the “local” Pods. If we use the -h (help) option, as shown in the first command below, we can obtain a list of all the flow tables. In the second command, I dump the flows for one particular local Pod, so we can observe the type of flows associated with it. I have added some colours to the output so we can easily identify the flow numbers and their descriptions.
root [ / ]# antctl get ovsflows -h
Dump all the OVS flows or the flows installed for the specified entity.

Usage:
  antctl get ovsflows [flags]

Aliases:
  ovsflows, of

Examples:
  Dump all OVS flows
  $ antctl get ovsflows
  Dump OVS flows of a local Pod
  $ antctl get ovsflows -p pod1 -n ns1
  Dump OVS flows of a NetworkPolicy
  $ antctl get ovsflows --networkpolicy np1 -n ns1
  Dump OVS flows of a flow Table
  $ antctl get ovsflows -T IngressRule

  Antrea OVS Flow Tables:
  0       Classification
  5       Uplink
  10      SpoofGuard
  20      ARPResponder
  29      ServiceHairpin
  30      ConntrackZone
  31      ConntrackState
  40      DNAT(SessionAffinity)
  40      SessionAffinity
  41      ServiceLB
  42      EndpointDNAT
  45      CNPEmergencyEgressRule
  46      CNPSecurityOpsEgressRule
  47      CNPNetworkOpsEgressRule
  48      CNPPlatformEgressRule
  49      CNPApplicationEgressRule
  50      EgressRule
  60      EgressDefaultRule
  70      l3Forwarding
  80      L2Forwarding
  85      CNPEmergencyIngressRule
  86      CNPSecurityOpsIngressRule
  87      CNPNetworkOpsIngressRule
  88      CNPPlatformIngressRule
  89      CNPApplicationIngressRule
  90      IngressRule
  100     IngressDefaultRule
  105     ConntrackCommit
  106     HairpinSNATTable
  110     Output

Flags:
  -h, --help                   help for ovsflows
  -n, --namespace string       Namespace of the entity
      --networkpolicy string   NetworkPolicy name. If present, Namespace must be provided.
  -o, --output string          output format: json|table|yaml (default "table")
  -p, --pod string             Name of a local Pod. If present, Namespace must be provided.
  -T, --table string           Antrea OVS flow table name or number

Global Flags:
  -k, --kubeconfig string   absolute path to the kubeconfig file
  -s, --server string       address and port of the API server, taking precedence over the default endpoint and the one set in kubeconfig
  -t, --timeout duration    time limit of the execution of the command
  -v, --verbose             enable verbose output
root [ / ]#
root [ / ]# antctl get ovsflows -n default -p nginx-deployment-cc7df4f8f-69cm2
FLOW
table=70, n_packets=3, n_bytes=222, priority=200,ip,dl_dst=aa:bb:cc:dd:ee:ff,nw_dst=192.168.2.2 actions=set_field:46:05:cf:f0:69:b3->eth_src,set_field:32:a1:45:43:7c:96->eth_dst,dec_ttl,goto_table:80
table=0, n_packets=498, n_bytes=34716, priority=190,in_port="nginx-de-481c4a" actions=load:0x2->NXM_NX_REG0[0..15],goto_table:10
table=10, n_packets=5, n_bytes=270, priority=200,ip,in_port="nginx-de-481c4a",dl_src=32:a1:45:43:7c:96,nw_src=192.168.2.2 actions=goto_table:30
table=10, n_packets=5, n_bytes=210, priority=200,arp,in_port="nginx-de-481c4a",arp_spa=192.168.2.2,arp_sha=32:a1:45:43:7c:96 actions=goto_table:20
table=80, n_packets=5, n_bytes=370, priority=200,dl_dst=32:a1:45:43:7c:96 actions=load:0x3->NXM_NX_REG1[],load:0x1->NXM_NX_REG0[16],goto_table:90
root [ / ]#
In the flow tables above, table 70 defines the L3 forwarding and table 80 defines the L2 forwarding. I added some colours to make them easier to distinguish. Table 10 contains the spoof-guard rules, which prevent IP and ARP spoofing from local Pods. To get more details about each of the different flow tables, check out the Antrea OVS pipeline document, which has detailed information about each table.
Other commands which might be interesting to run in this antrea-agent container display the agent information, as well as information about network policies. We have not yet created a network policy, so there is nothing very useful to see there yet; we will return to it shortly.
root [ / ]# antctl get agentinfo
POD                              NODE                                                STATUS    NODE-SUBNET      NETWORK-POLICIES   ADDRESS-GROUPS   APPLIED-TO-GROUPS   LOCAL-PODS
kube-system/antrea-agent-ktn22   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   Healthy   192.168.2.0/24   0                  0                0                   4
root [ / ]#
OK – let’s quit the container and return to my desktop. Let’s run a describe against one of the Antrea agents so we can observe some similar flow table information, albeit not at the granularity of a Pod as seen previously. I’ve snipped some of the output to make it easier to read. This is the control plane agent.
$ kubectl describe antreaagentinfos tkg-cluster-1-18-5-control-plane-bsn9l
Name:         tkg-cluster-1-18-5-control-plane-bsn9l
Namespace:
Labels:       <none>
Annotations:  <none>
Agent Conditions:
  Last Heartbeat Time:  2020-11-12T17:35:35Z
  Status:               True
  Type:                 AgentHealthy
<--snip-->
Kind:           AntreaAgentInfo
Local Pod Num:  5
Metadata:
  Creation Timestamp:  2020-10-23T08:25:23Z
  Generation:          29285
  Managed Fields:
    API Version:  clusterinformation.antrea.tanzu.vmware.com/v1beta1
<--snip-->
    Manager:         antrea-agent
    Operation:       Update
    Time:            2020-11-12T17:35:35Z
  Resource Version:  7476901
  Self Link:         /apis/clusterinformation.antrea.tanzu.vmware.com/v1beta1/antreaagentinfos/tkg-cluster-1-18-5-control-plane-bsn9l
  UID:               8e1bd7c7-0c3a-4f00-a405-3ef2f0b1d8f1
Network Policy Controller Info:
Node Ref:
  Kind:  Node
  Name:  tkg-cluster-1-18-5-control-plane-bsn9l
Node Subnet:
  192.168.0.0/24
Ovs Info:
  Bridge Name:  br-int
  Flow Table:
    0:    8
    10:   13
    100:  1
    105:  3
    110:  2
    20:   4
    30:   1
    31:   4
    40:   2
    50:   2
    60:   1
    70:   9
    80:   7
    90:   3
  Version:  2.13.1
Pod Ref:
  Kind:       Pod
  Name:       antrea-agent-l7gzv
  Namespace:  kube-system
Version:  v0.9.2-unknown
Events:  <none>
Again, we see the list of OVS flow tables defined by Antrea, such as the spoof-guard table (10), as well as the L2 (80) and L3 (70) forwarding tables, among others. We also see the number of packet-processing flows in each table. Let’s now focus our attention on the other feature of Antrea, namely NetworkPolicies. There is a considerable amount of information on how to configure network policies available here (note that this covers Antrea-native policies, which are currently not available in TKG guest clusters, but it is still useful for reference purposes). For this part of the post, I am going to leverage a great example provided by Curtis Collicutt. Kudos Curtis!
As already mentioned, and shown throughout the post, I have a simple nginx deployment in my default namespace. I am now going to deploy a simple busybox Pod in the same namespace and show how access between Pods is currently wide open, implying this newly deployed Pod has direct access to the nginx web server via its ClusterIP address.
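The contents of busybox-cor.yaml are not reproduced in this post; a minimal Pod along the following lines would behave the same way (the container name, image tag and sleep command are illustrative, not taken from the actual file):

```yaml
# Hypothetical sketch of the busybox test Pod applied below.
apiVersion: v1
kind: Pod
metadata:
  name: ch-busybox
spec:
  containers:
    - name: busybox
      image: busybox:1.32
      # Keep the container running so we can kubectl exec into it.
      command: ["sh", "-c", "sleep 3600"]
```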
$ kubectl get svc nginx-svc
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
nginx-svc   LoadBalancer   10.108.137.209   10.27.62.18   443:30472/TCP,80:30915/TCP   20d

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-cc7df4f8f-69cm2   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-ms5jm   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-pcn79   1/1     Running   2          20d

$ kubectl apply -f busybox-cor.yaml
pod/ch-busybox created

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
ch-busybox                         1/1     Running   0          3m13s
nginx-deployment-cc7df4f8f-69cm2   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-ms5jm   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-pcn79   1/1     Running   2          20d

$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
<title>Welcome to nginx!</title>
/ # exit
$
At this point, any Pod running in the cluster has access to all other Pods. This is, most likely, not desirable. The next objective is to create a NetworkPolicy with an ingress rule which prevents the busybox Pod from accessing the nginx web server. The network policy we create using the manifest below will only allow Pods with a matching label (app=nginx) to have ingress access to the nginx Pods.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: nginx-allow
spec:
  podSelector:
    matchLabels:
      app: nginx
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: nginx
Obviously this is a very simple policy with a single ingress rule. You can imagine a much more detailed policy with multiple ingress and egress rules, controlling whether different types of traffic should be allowed or dropped between Pods in the namespace.
Let’s now apply this network policy and see if the busybox Pod can still access the nginx web server.
$ kubectl apply -f antrea-network-policy.yaml
networkpolicy.networking.k8s.io/nginx-allow created

$ kubectl get networkpolicy
NAME          POD-SELECTOR   AGE
nginx-allow   app=nginx      8s

$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
wget: can't connect to remote host (10.108.137.209): Connection timed out
/ #
As we can see, the busybox Pod can no longer access the nginx web service. Pretty cool, huh?
Now, just to prove a point, we can edit the busybox manifest to add a label (app=nginx) which allows it to communicate with the web server once again, since the ingress network policy allows any Pod with this label to access the web server. The first command below shows that we have added the label. Then we connect to the Pod once more and retry the wget command.
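In the busybox manifest, the change amounts to adding the app label to the Pod metadata (only the changed section is sketched here; the rest of the manifest stays as before). The same result can be achieved in place with kubectl label pod ch-busybox app=nginx.

```yaml
# Fragment of the busybox Pod manifest with the new label added.
metadata:
  name: ch-busybox
  labels:
    app: nginx   # matches the podSelector in the nginx-allow ingress rule
```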
$ kubectl get pod ch-busybox -L app
NAME         READY   STATUS    RESTARTS   AGE   APP
ch-busybox   1/1     Running   0          16m   nginx

$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
<title>Welcome to nginx!</title>
/ #
Nice! And now if we use the antctl command seen earlier, we can look at the new Ingress rules (Flow Table 90) in the Antrea agent container.
$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-agent -- antctl get ovsflows -T ingressRule
FLOW
table=90, n_packets=11666344, n_bytes=1394912954, priority=210,ct_state=-new+est,ip actions=goto_table:105
table=90, n_packets=7, n_bytes=518, priority=210,ip,nw_src=192.168.2.1 actions=goto_table:105
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.2.2 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.2.4 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.1.2 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.2.7 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,reg1=0x3 actions=conjunction(1,2/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,reg1=0x4 actions=conjunction(1,2/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,reg1=0x8 actions=conjunction(1,2/2)
table=90, n_packets=1, n_bytes=74, priority=190,conj_id=1,ip actions=load:0x1->NXM_NX_REG6[],goto_table:105
table=90, n_packets=586, n_bytes=43364, priority=0 actions=goto_table:100
The network sources (nw_src) in the ingress rules above can be tied directly to the IP addresses of the Pods.
$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP            NODE                                                NOMINATED NODE   READINESS GATES
ch-busybox                         1/1     Running   0          24m   192.168.2.7   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   <none>           <none>
nginx-deployment-cc7df4f8f-69cm2   1/1     Running   2          20d   192.168.2.2   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   <none>           <none>
nginx-deployment-cc7df4f8f-ms5jm   1/1     Running   2          20d   192.168.1.2   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-kfd5z   <none>           <none>
nginx-deployment-cc7df4f8f-pcn79   1/1     Running   2          20d   192.168.2.4   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   <none>           <none>
Let’s try a slightly different approach this time. Let’s give the busybox Pod a different label (app=busybox), which no longer matches the ingress rule, and then add another entry to the network policy manifest so that Pods with this label can also communicate with the web server.
$ kubectl get pod ch-busybox -L app
NAME         READY   STATUS    RESTARTS   AGE     APP
ch-busybox   1/1     Running   70         2d22h   busybox

$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
wget: can't connect to remote host (10.108.137.209): Connection timed out
/ #
As before, only Pods with an app=nginx label have an ingress rule allowing them to communicate with the web server. Let’s add another rule to the network policy that also allows ingress from Pods labeled app=busybox.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: nginx-allow
spec:
  podSelector:
    matchLabels:
      app: nginx
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: nginx
        - podSelector:
            matchLabels:
              app: busybox
Let’s apply the changes and repeat our test.
$ kubectl apply -f antrea-network-policy.yaml
networkpolicy.networking.k8s.io/nginx-allow configured

$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
<title>Welcome to nginx!</title>
/ #
And the ingress for the Pod with a busybox label is now working. Nice! One final item to show is that the network policy can also be queried via antctl on the Antrea agent container in the Antrea agent Pod.
$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-agent -- antctl get networkpolicy
NAMESPACE   NAME          APPLIED-TO                             RULES
default     nginx-allow   766a9e51-f132-5c2f-b862-9ac68e75d77d   1
Hopefully that has given you a decent idea of how simple, yet powerful, the network policies in the new Antrea CNI for vSphere with Tanzu TKG guest clusters are. I’ve only just started to experiment, and I’m aware that there are a number of additional innovations coming down the line. No doubt I’ll be revisiting Antrea to share more goodness with you at some point.
Last, I’d like to extend a thanks to Antonin Bas from our Antrea team for providing much of the insight shared here. Merci beaucoup Antonin.
Hello, can I check: can we use these network policies to guard TKC-to-TKC communication, or VM-to-TKC ingress communication? If so, how can we use network policy to apply the rules to the NSX-T GFW of the respective TKC T1 router?
I’m not sure you need to use Antrea to achieve this goal. When you deploy vSphere with Tanzu, the Workload Management step allows you to create multiple workload networks. When you deploy the TKG cluster, simply place the different TKG clusters on their own separate segments. This will achieve your goal, and can also isolate TKG clusters from other VM workloads in your environment.