A closer look at Antrea, the new CNI for vSphere with Tanzu guest clusters

Cormac

4 years ago

I’ve spent quite a bit of time highlighting many of the new features of vSphere with Tanzu in earlier blog posts. In those posts, we saw how vSphere with Tanzu could be used to provision Tanzu Kubernetes Grid (TKG) guest clusters to provide a native, upstream-like, VMware-supported Kubernetes. In this post, I want to delve into the guest cluster in more detail and examine Antrea, the new default Container Network Interface (CNI) that now ships with the TKG guest cluster.

Antrea provides networking and security services for a Kubernetes cluster. It is based on the Open vSwitch project (OVS), which is used as the networking data plane to provide network overlays for Pod-to-Pod communication. Another interesting feature of Antrea is network security policies. While Kubernetes does have in-built network policies, Antrea builds on those native network policies to provide more fine-grained network policies of its own. It has a ClusterNetworkPolicy which, as the name implies, operates at the Kubernetes cluster level. It also has NetworkPolicy which limits the scope of a policy to a Kubernetes namespace. The ClusterNetworkPolicy can be thought of as a means for a Kubernetes Cluster Admin to create a security policy for the cluster as a whole. The NetworkPolicy can be thought of as a means for a developer to secure applications in a particular namespace. We will later take a look at how these network policies can be used to secure traffic between Pods in the same workload, and between workloads in the same cluster.

Let’s start with a look at my TKG guest cluster, which I already provisioned from vSphere with Tanzu. It is running Kubernetes v1.18.5, and has been deployed with a single control plane node and two worker nodes. I’ve also displayed the Pods related to Antrea.

$ kubectl get nodes
NAME                                                STATUS   ROLES    AGE   VERSION
tkg-cluster-1-18-5-control-plane-bsn9l              Ready    master   21d   v1.18.5+vmware.1
tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   Ready    <none>   21d   v1.18.5+vmware.1
tkg-cluster-1-18-5-workers-8zj27-67c98696dc-kfd5z   Ready    <none>   21d   v1.18.5+vmware.1

$ kubectl get pods -A | grep antrea
NAMESPACE         NAME                                    READY   STATUS    RESTARTS   AGE
kube-system       antrea-agent-ktn22                      2/2     Running   8          21d
kube-system       antrea-agent-l7gzv                      2/2     Running   9          21d
kube-system       antrea-agent-mt8xt                      2/2     Running   11         21d
kube-system       antrea-controller-846bd89cc6-87bjk      1/1     Running   5          21d

As you can see, in this TKG cluster with a single control plane node and two worker nodes, there is a single Antrea controller and three Antrea agents, one per node. The Antrea Controller watches the Kubernetes API server, and if there are any requests or updates around network policy, or Pod and Namespace networking, the controller distributes the required policies to all Antrea Agents.

In my TKG cluster, I have deployed a Cassandra stateful set and an nginx deployment. The ClusterIP has been provided by Kubernetes (the HA-Proxy Load Balancer deployed with vSphere with Tanzu provides the External IP for the nginx web server).

$ kubectl get svc -A

NAMESPACE     NAME         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
cassandra     cassandra    ClusterIP      10.100.48.126    <none>        9042/TCP                     20d
default       kubernetes   ClusterIP      10.96.0.1        <none>        443/TCP                      20d
default       nginx-svc    LoadBalancer   10.108.137.209   10.27.62.18   443:30472/TCP,80:30915/TCP   20d
default       supervisor   ClusterIP      None             <none>        6443/TCP                     20d
kube-system   antrea       ClusterIP      10.96.120.159    <none>        443/TCP                      20d
kube-system   kube-dns     ClusterIP      10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP       20d
$

With this information, let’s take a closer look at the containers that make up one of the Antrea agents. I’ll use my old trick of requesting logs from a Pod, which fails but at the same time informs me of the containers that make up the Pod. As you can see below, there is an antrea-agent and an antrea-ovs container in the Pod. To begin, let’s take a look at the antrea-ovs container. As you can see there are a number of OVS related commands available. The Open vSwitch documentation provides more information about the commands.

$ kubectl logs antrea-agent-ktn22 -n kube-system
error: a container name must be specified for pod antrea-agent-ktn22, \
choose one of: [antrea-agent antrea-ovs] or one of the init containers: [install-cni]


$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-ovs -- bash
root [ / ]# ovs-
ovs-appctl           ovs-dpctl            ovs-ofctl            ovs-pki              ovs-test             ovs-vsctl
ovs-bugtool          ovs-dpctl-top        ovs-parse-backtrace  ovs-tcpdump          ovs-testcontroller   ovs-vswitchd
ovs-docker           ovs-l3ping           ovs-pcap             ovs-tcpundump        ovs-vlan-test


root [ / ]# ovs-vsctl show
83ed275e-084b-4ebb-a19e-f150646bf62b
    Bridge br-int
        datapath_type: system
        Port antrea-gw0
            Interface antrea-gw0
                type: internal
        Port vsphere--940a85
            Interface vsphere--940a85
        Port nginx-de-548af9
            Interface nginx-de-548af9
        Port antrea-tun0
            Interface antrea-tun0
                type: geneve
                options: {csum="true", key=flow, remote_ip=flow}
        Port cassandr-1e6e0a
            Interface cassandr-1e6e0a
        Port nginx-de-481c4a
            Interface nginx-de-481c4a
    ovs_version: "2.13.1"

The ovs-vsctl command allows us to query the Open vSwitch (OVS) configuration database, and the output above is a brief overview of its contents. Of note in the output is antrea-tun0. When packets from one of the ‘local’ Pods on this node are sent to a Pod on another node, they are forwarded to the antrea-tun0 port on the ‘local’ OVS. The packets are then sent to the destination node through the tunnel (implemented using Geneve above). On the destination node, the packets are received via the corresponding antrea-tun0 port and forwarded on to the destination Pod.

Also interesting is how packets are routed externally. Packets sent to an external IP on the node’s network are forwarded to the antrea-gw0 port, routed to the appropriate network interface of the Kubernetes node and sent out to the network from there. The Antrea Kubernetes Networking whitepaper has great detail on all of these inner workings – check it out if you want to learn more.
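As a rough illustration of the forwarding behaviour just described, here is a minimal Python sketch of the decision, not Antrea's actual implementation. The node subnet 192.168.2.0/24 is the one this agent reports later in the post; the cluster-wide Pod CIDR of 192.168.0.0/16 and the function name are assumptions made for the example.

```python
import ipaddress

# Node subnet as reported by "antctl get agentinfo" later in this post.
LOCAL_NODE_SUBNET = ipaddress.ip_network("192.168.2.0/24")
# ASSUMPTION: the cluster-wide Pod CIDR; the post does not state it.
CLUSTER_POD_CIDR = ipaddress.ip_network("192.168.0.0/16")


def output_port(dst_ip: str) -> str:
    """Simplified guess at which OVS port a packet from a local Pod leaves through."""
    dst = ipaddress.ip_address(dst_ip)
    if dst in LOCAL_NODE_SUBNET:
        # Destination Pod is on this node: deliver straight to its OVS port.
        return "local pod port"
    if dst in CLUSTER_POD_CIDR:
        # Destination Pod is on another node: send through the Geneve tunnel.
        return "antrea-tun0"
    # Anything else is external: hand off to the gateway port for routing.
    return "antrea-gw0"


for ip in ("192.168.2.4", "192.168.1.2", "10.27.62.18"):
    print(ip, "->", output_port(ip))
```

The real pipeline makes this decision with flows in the forwarding tables rather than with subnet checks in code, and Service traffic takes additional steps that are ignored here.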

Let’s turn our attention to the other Antrea container next – antrea-agent. This has another really useful utility, antctl. In the output below, I am using it to examine the Pods on this node/agent that are using the OVS.

$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-agent -- bash
root [ / ]#
root [ / ]# antctl get

Get the status or resource of a topic

Usage:

  antctl get [command]

Available Commands:
  addressgroup   Print address groups
  agentinfo      Print agent's basic information
  appliedtogroup Print appliedto groups
  networkpolicy  Print NetworkPolicies
  ovsflows       Dump OVS flows
  podinterface   Print Pod's network interface information

Flags:
  -h, --help   help for get

Global Flags:
  -k, --kubeconfig string   absolute path to the kubeconfig file
  -s, --server string       address and port of the API server, taking precedence over the default endpoint and the one set in kubeconfig
  -t, --timeout duration    time limit of the execution of the command
  -v, --verbose             enable verbose output

Use "antctl get [command] --help" for more information about a command.

root [ / ]#


root [ / ]#  antctl get podinterface
NAMESPACE         NAME                             INTERFACE-NAME  IP          MAC               PORT-UUID                            OF-PORT CONTAINER-ID
cassandra         cassandra-1                      cassandr-1e6e0a 192.168.2.6 4e:c8:62:fb:8b:75 9fd9bdc3-5fd1-4c4a-b46a-d4e35af0f8da 7       ac559b874be
default           nginx-deployment-cc7df4f8f-69cm2 nginx-de-481c4a 192.168.2.2 32:a1:45:43:7c:96 bf184406-a53c-4b3d-84c7-a979f435e162 3       f3d429a49c6
default           nginx-deployment-cc7df4f8f-pcn79 nginx-de-548af9 192.168.2.4 f2:38:81:3e:e0:08 54b7032b-f513-4a2b-9bc8-924b2dcb620f 4       52c08818eeb
vmware-system-csi vsphere-csi-node-86fds           vsphere--940a85 192.168.2.3 d2:95:fc:3a:1e:a1 352c776b-1bcc-41d9-a961-50039e804689 5       08a976b1c76

As you can see, there is lots of good information here, such as IP address and MAC address. We can also use antctl to display OVS flows. OVS flows are used to implement the NetworkPolicies for the “local” Pods. If we use the -h (help) option, as shown in the first command, we can obtain a list of all the flow tables. In the second command, I dump the flows for one particular local Pod, so we can observe the type of flows associated with it.

root [ / ]#  antctl get ovsflows -h
Dump all the OVS flows or the flows installed for the specified entity.

Usage:

  antctl get ovsflows [flags]

Aliases:
  ovsflows, of

Examples:
  Dump all OVS flows
  $ antctl get ovsflows
  Dump OVS flows of a local Pod
  $ antctl get ovsflows -p pod1 -n ns1
  Dump OVS flows of a NetworkPolicy
  $ antctl get ovsflows --networkpolicy np1 -n ns1
  Dump OVS flows of a flow Table
  $ antctl get ovsflows -T IngressRule


  Antrea OVS Flow Tables:
  0     Classification
  5     Uplink
  10    SpoofGuard
  20    ARPResponder
  29    ServiceHairpin
  30    ConntrackZone
  31    ConntrackState
  40    DNAT(SessionAffinity)
  40    SessionAffinity
  41    ServiceLB
  42    EndpointDNAT
  45    CNPEmergencyEgressRule
  46    CNPSecurityOpsEgressRule
  47    CNPNetworkOpsEgressRule
  48    CNPPlatformEgressRule
  49    CNPApplicationEgressRule
  50    EgressRule
  60    EgressDefaultRule
  70    l3Forwarding
  80    L2Forwarding
  85    CNPEmergencyIngressRule
  86    CNPSecurityOpsIngressRule
  87    CNPNetworkOpsIngressRule
  88    CNPPlatformIngressRule
  89    CNPApplicationIngressRule
  90    IngressRule
  100   IngressDefaultRule
  105   ConntrackCommit
  106   HairpinSNATTable
  110   Output


Flags:
  -h, --help                   help for ovsflows
  -n, --namespace string       Namespace of the entity
      --networkpolicy string   NetworkPolicy name. If present, Namespace must be provided.
  -o, --output string          output format: json|table|yaml (default "table")
  -p, --pod string             Name of a local Pod. If present, Namespace must be provided.
  -T, --table string           Antrea OVS flow table name or number


Global Flags:
  -k, --kubeconfig string   absolute path to the kubeconfig file
  -s, --server string       address and port of the API server, taking precedence over the default endpoint and the one set in kubeconfig
  -t, --timeout duration    time limit of the execution of the command
  -v, --verbose             enable verbose output
root [ / ]#      



root [ / ]# antctl get ovsflows -n default -p nginx-deployment-cc7df4f8f-69cm2

FLOW
table=70, n_packets=3, n_bytes=222, priority=200,ip,dl_dst=aa:bb:cc:dd:ee:ff,nw_dst=192.168.2.2 actions=set_field:46:05:cf:f0:69:b3->eth_src,set_field:32:a1:45:43:7c:96->eth_dst,dec_ttl,goto_table:80
table=0, n_packets=498, n_bytes=34716, priority=190,in_port="nginx-de-481c4a" actions=load:0x2->NXM_NX_REG0[0..15],goto_table:10
table=10, n_packets=5, n_bytes=270, priority=200,ip,in_port="nginx-de-481c4a",dl_src=32:a1:45:43:7c:96,nw_src=192.168.2.2 actions=goto_table:30
table=10, n_packets=5, n_bytes=210, priority=200,arp,in_port="nginx-de-481c4a",arp_spa=192.168.2.2,arp_sha=32:a1:45:43:7c:96 actions=goto_table:20
table=80, n_packets=5, n_bytes=370, priority=200,dl_dst=32:a1:45:43:7c:96 actions=load:0x3->NXM_NX_REG1[],load:0x1->NXM_NX_REG0[16],goto_table:90

root [ / ]#

In the flow dump above, table 70 handles the L3 forwarding and table 80 handles the L2 forwarding. Table 10 holds the SpoofGuard flows, which prevent IP and ARP spoofing from local Pods. To get more details about each of the different flow tables, check out this OVS pipeline document, which has detailed information about each table.
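If you want to work with a flow dump outside the container, the lines are easy to pick apart. The following is a small Python sketch that groups the flows from the nginx Pod dump by table and labels them with the names from the help listing. The helper name parse_flow is hypothetical, and the parsing assumes the line layout shown above.

```python
import re
from collections import Counter

# Table names copied from the "antctl get ovsflows -h" listing above.
TABLE_NAMES = {0: "Classification", 10: "SpoofGuard", 70: "l3Forwarding", 80: "L2Forwarding"}

# Flow lines copied from the nginx Pod dump above.
FLOWS = """\
table=70, n_packets=3, n_bytes=222, priority=200,ip,dl_dst=aa:bb:cc:dd:ee:ff,nw_dst=192.168.2.2 actions=set_field:46:05:cf:f0:69:b3->eth_src,set_field:32:a1:45:43:7c:96->eth_dst,dec_ttl,goto_table:80
table=0, n_packets=498, n_bytes=34716, priority=190,in_port="nginx-de-481c4a" actions=load:0x2->NXM_NX_REG0[0..15],goto_table:10
table=10, n_packets=5, n_bytes=270, priority=200,ip,in_port="nginx-de-481c4a",dl_src=32:a1:45:43:7c:96,nw_src=192.168.2.2 actions=goto_table:30
table=10, n_packets=5, n_bytes=210, priority=200,arp,in_port="nginx-de-481c4a",arp_spa=192.168.2.2,arp_sha=32:a1:45:43:7c:96 actions=goto_table:20
table=80, n_packets=5, n_bytes=370, priority=200,dl_dst=32:a1:45:43:7c:96 actions=load:0x3->NXM_NX_REG1[],load:0x1->NXM_NX_REG0[16],goto_table:90
"""


def parse_flow(line):
    """Split one flow line into (table number, match string, actions string)."""
    head, actions = line.split(" actions=", 1)
    table = int(re.match(r"table=(\d+)", head).group(1))
    # The last ", "-separated field holds the priority and the match criteria.
    match = head.split(", ")[-1]
    return table, match, actions


parsed = [parse_flow(line) for line in FLOWS.splitlines()]
counts = Counter(table for table, _, _ in parsed)

for table in sorted(counts):
    print(f"table {table:>3} ({TABLE_NAMES[table]}): {counts[table]} flow(s)")
```

Note the two SpoofGuard flows in table 10: one matches IP traffic on the Pod's source MAC and IP, the other matches ARP on the same pair.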

Other commands which might be interesting to run in this antrea-agent container are to display the agent information, as well as display information about network policy. We have not yet created a network policy so there is nothing very useful here yet. We will return to this shortly.

root [ / ]# antctl get agentinfo
POD                            NODE                                              STATUS  NODE-SUBNET    NETWORK-POLICIES ADDRESS-GROUPS APPLIED-TO-GROUPS LOCAL-PODS
kube-system/antrea-agent-ktn22 tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r Healthy 192.168.2.0/24 0                0              0                 4
root [ / ]#

OK – let’s quit the container and return to my desktop. Let’s run a describe against one of the Antrea agents so we can observe some similar flow table information, albeit not at the granularity of a Pod as seen previously. I’ve snipped some of the output to make it easier to read. This is the control plane agent.

$ kubectl describe antreaagentinfos tkg-cluster-1-18-5-control-plane-bsn9l
Name:         tkg-cluster-1-18-5-control-plane-bsn9l
Namespace:
Labels:       <none>
Annotations:  <none>
Agent Conditions:
  Last Heartbeat Time:  2020-11-12T17:35:35Z
  Status:               True
  Type:                 AgentHealthy
  
  <--snip-->

Kind:                   AntreaAgentInfo
Local Pod Num:          5
Metadata:
  Creation Timestamp:  2020-10-23T08:25:23Z
  Generation:          29285
  Managed Fields:
    API Version:  clusterinformation.antrea.tanzu.vmware.com/v1beta1
 
 <--snip-->

    Manager:         antrea-agent
    Operation:       Update
    Time:            2020-11-12T17:35:35Z
  Resource Version:  7476901
  Self Link:         /apis/clusterinformation.antrea.tanzu.vmware.com/v1beta1/antreaagentinfos/tkg-cluster-1-18-5-control-plane-bsn9l
  UID:               8e1bd7c7-0c3a-4f00-a405-3ef2f0b1d8f1
Network Policy Controller Info:
Node Ref:
  Kind:  Node
  Name:  tkg-cluster-1-18-5-control-plane-bsn9l
Node Subnet:
  192.168.0.0/24
Ovs Info:
  Bridge Name:  br-int
  Flow Table:
    0:      8
    10:     13
    100:    1
    105:    3
    110:    2
    20:     4
    30:     1
    31:     4
    40:     2
    50:     2
    60:     1
    70:     9
    80:     7
    90:     3
  Version:  2.13.1
Pod Ref:
  Kind:       Pod
  Name:       antrea-agent-l7gzv
  Namespace:  kube-system
Version:      v0.9.2-unknown
Events:       <none>
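One small wrinkle in the output above is that the Flow Table keys are listed in string order (0, 10, 100, 105, ...), not in pipeline order. A short Python sketch like the following puts them back in numeric order and totals them. The counts are copied from the output above, and the names come from the antctl help listing shown earlier; treat the pairing as my own reading of the two outputs.

```python
# Flow counts per table, copied from the "Flow Table" section above.
FLOW_COUNTS = {
    "0": 8, "10": 13, "100": 1, "105": 3, "110": 2, "20": 4, "30": 1,
    "31": 4, "40": 2, "50": 2, "60": 1, "70": 9, "80": 7, "90": 3,
}

# Table names from the "antctl get ovsflows -h" listing earlier in the post.
TABLE_NAMES = {
    0: "Classification", 10: "SpoofGuard", 20: "ARPResponder",
    30: "ConntrackZone", 31: "ConntrackState", 40: "DNAT(SessionAffinity)",
    50: "EgressRule", 60: "EgressDefaultRule", 70: "l3Forwarding",
    80: "L2Forwarding", 90: "IngressRule", 100: "IngressDefaultRule",
    105: "ConntrackCommit", 110: "Output",
}

# Convert the keys to integers so the tables sort in pipeline order.
ordered = sorted((int(table), count) for table, count in FLOW_COUNTS.items())
total = sum(count for _, count in ordered)

for table, count in ordered:
    print(f"{table:>4}  {TABLE_NAMES[table]:<22} {count}")
print(f"total flows: {total}")
```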

Again, we see the list of OVS flow tables defined by Antrea, such as the spoof-guard table (10), as well as the L2 (80) and L3 (70) forwarding tables, among others. We also see the number of packet-processing flows in each table. Let’s now focus our attention on the other feature of Antrea, namely NetworkPolicies. There is a considerable amount of information on how to configure Network Policies available here (note that this is for Antrea native policies, which are currently not available in TKG guest clusters, but it still has useful information for reference purposes). For this part of the post, I am going to leverage a great example provided by Curtis Collicutt. Kudos Curtis!

As already mentioned, and shown throughout the post, I have a simple nginx deployment in my default namespace. I am now going to deploy a simple busybox Pod in the same namespace and show how access between Pods is currently wide open, implying this newly deployed Pod has direct access to the nginx web server via its ClusterIP address.

$ kubectl get svc nginx-svc
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
nginx-svc   LoadBalancer   10.108.137.209   10.27.62.18   443:30472/TCP,80:30915/TCP   20d


$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-cc7df4f8f-69cm2   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-ms5jm   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-pcn79   1/1     Running   2          20d


$ kubectl apply -f busybox-cor.yaml
pod/ch-busybox created


$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
ch-busybox                         1/1     Running   0          3m13s
nginx-deployment-cc7df4f8f-69cm2   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-ms5jm   1/1     Running   2          20d
nginx-deployment-cc7df4f8f-pcn79   1/1     Running   2          20d


$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
<title>Welcome to nginx!</title>
/ # exit
$

At this point in time, any Pod running in the cluster has access to all other Pods. This is, most likely, not desirable. The next objective is to create a NetworkPolicy with an ingress rule which prevents the busybox Pod from accessing the nginx web server. The network policy we create using the manifest below selects the nginx Pods (app=nginx) and only allows ingress to them from Pods carrying the same label.

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: nginx-allow
spec:
  podSelector:
    matchLabels:
      app: nginx
  ingress:
  - from:
      - podSelector:
          matchLabels:
            app: nginx

Obviously this is a very simple policy with a single ingress rule. You can imagine a much more detailed policy with multiple ingress and egress rules, controlling whether different types of traffic should be allowed or dropped between Pods in the namespace.
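To reason about what this manifest will do before applying it, it can help to model the matching logic. Below is a minimal Python sketch of how an ingress rule with podSelector peers is evaluated for Pods in a single namespace. It is a simplification of the Kubernetes semantics, not Antrea code; the function names are hypothetical, and the empty label set used for the busybox Pod is an assumption, since the post does not show its original labels.

```python
def selector_matches(match_labels, pod_labels):
    """A matchLabels selector matches when every listed label is present with the same value."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


def ingress_allowed(policy, src_labels, dst_labels):
    """Decide whether src may reach dst under one policy (same namespace only)."""
    # If the policy does not select the destination Pod, it places no restriction on it.
    if not selector_matches(policy["podSelector"], dst_labels):
        return True
    # Otherwise the source must match at least one of the "from" peers.
    return any(selector_matches(peer, src_labels) for peer in policy["from"])


# The nginx-allow policy from the manifest above, reduced to its selectors.
NGINX_ALLOW = {"podSelector": {"app": "nginx"}, "from": [{"app": "nginx"}]}

nginx_pod = {"app": "nginx"}
busybox_pod = {}  # ASSUMPTION: no app label on the busybox Pod yet

print("nginx   -> nginx:", ingress_allowed(NGINX_ALLOW, nginx_pod, nginx_pod))
print("busybox -> nginx:", ingress_allowed(NGINX_ALLOW, busybox_pod, nginx_pod))
```

The sketch covers one policy and podSelector peers only. Real clusters combine all policies that select a Pod, and peers can also be namespace selectors or IP blocks.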

Let’s now apply this network policy and see if the busybox Pod can still access the nginx web server.

$ kubectl apply -f antrea-network-policy.yaml
networkpolicy.networking.k8s.io/nginx-allow created


$ kubectl get networkpolicy
NAME          POD-SELECTOR   AGE
nginx-allow   app=nginx      8s


$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
wget: can't connect to remote host (10.108.137.209): Connection timed out
/ #

As we can see, the busybox Pod can no longer access the nginx web service. Pretty cool, huh?

Now, just to prove a point, we can edit the busybox manifest to add a label (app=nginx) which allows it to communicate with the web server once again, since the ingress network policy should allow any Pod with this label to access the web server. The first command below shows that we have added the label. Then we connect to the Pod once more and retry the wget command.

$ kubectl get pod ch-busybox -L app
NAME         READY   STATUS    RESTARTS   AGE   APP
ch-busybox   1/1     Running   0          16m   nginx

$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
<title>Welcome to nginx!</title>
/ #

Nice! And now if we use the antctl command seen earlier, we can look at the new Ingress rules (Flow Table 90) in the Antrea agent container.

$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-agent -- antctl get ovsflows -T ingressRule
FLOW
table=90, n_packets=11666344, n_bytes=1394912954, priority=210,ct_state=-new+est,ip actions=goto_table:105
table=90, n_packets=7, n_bytes=518, priority=210,ip,nw_src=192.168.2.1 actions=goto_table:105
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.2.2 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.2.4 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.1.2 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,nw_src=192.168.2.7 actions=conjunction(1,1/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,reg1=0x3 actions=conjunction(1,2/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,reg1=0x4 actions=conjunction(1,2/2)
table=90, n_packets=0, n_bytes=0, priority=200,ip,reg1=0x8 actions=conjunction(1,2/2)
table=90, n_packets=1, n_bytes=74, priority=190,conj_id=1,ip actions=load:0x1->NXM_NX_REG6[],goto_table:105
table=90, n_packets=586, n_bytes=43364, priority=0 actions=goto_table:100

The source addresses (nw_src) in the priority=200 flows above can be tied directly to the IP addresses of the Pods.

$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP            NODE                                                NOMINATED NODE   READINESS GATES
ch-busybox                         1/1     Running   0          24m   192.168.2.7   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   <none>           <none>
nginx-deployment-cc7df4f8f-69cm2   1/1     Running   2          20d   192.168.2.2   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   <none>           <none>
nginx-deployment-cc7df4f8f-ms5jm   1/1     Running   2          20d   192.168.1.2   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-kfd5z   <none>           <none>
nginx-deployment-cc7df4f8f-pcn79   1/1     Running   2          20d   192.168.2.4   tkg-cluster-1-18-5-workers-8zj27-67c98696dc-52g5r   <none>           <none>
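The conjunction(1,1/2) and conjunction(1,2/2) actions in the table 90 flows are OVS conjunctive matches: a packet only hits the conj_id=1 flow if it matches at least one flow in each of the two dimensions. Here the first dimension is the set of allowed source IPs, and the second is reg1, which the table 80 flow seen earlier loads with the destination OF port (0x3 for nginx-de-481c4a). The following Python sketch models just that check, using the values from the outputs above. The function name is hypothetical, and 0x8 is presumably the port of the relabelled busybox Pod, although the post does not show its OF port.

```python
# Pod IPs copied from the "kubectl get pods -o wide" output above.
POD_IPS = {
    "ch-busybox": "192.168.2.7",
    "nginx-deployment-cc7df4f8f-69cm2": "192.168.2.2",
    "nginx-deployment-cc7df4f8f-ms5jm": "192.168.1.2",
    "nginx-deployment-cc7df4f8f-pcn79": "192.168.2.4",
}

# Dimension 1/2: the nw_src values from the conjunction(1,1/2) flows.
ALLOWED_SOURCES = {"192.168.2.2", "192.168.2.4", "192.168.1.2", "192.168.2.7"}
# Dimension 2/2: the reg1 values (destination OF ports) from the conjunction(1,2/2) flows.
PROTECTED_PORTS = {0x3, 0x4, 0x8}


def conj_1_matches(nw_src, reg1):
    """True when the packet matches one flow in each dimension of conjunction 1."""
    return nw_src in ALLOWED_SOURCES and reg1 in PROTECTED_PORTS


# Every allowed source is the IP of a Pod labelled app=nginx at this point.
print(set(POD_IPS.values()) == ALLOWED_SOURCES)
# busybox (now labelled app=nginx) to the nginx Pod on OF port 3: allowed.
print(conj_1_matches(POD_IPS["ch-busybox"], 0x3))
# The CSI node Pod (192.168.2.3) to the same nginx Pod: no conjunction match.
print(conj_1_matches("192.168.2.3", 0x3))
```

Packets that miss the conjunction, and the higher-priority flows ahead of it, fall through to the priority=0 flow and on to table 100, the IngressDefaultRule table.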

Let’s try a slightly different approach this time. Let’s give the busybox Pod a different label (app=busybox), and then add a matching entry to the Network Policy manifest so that Pods with this second label are also allowed to reach the web server. First, here is the Pod with its new label.

$ kubectl get pod ch-busybox -L app
NAME         READY   STATUS    RESTARTS   AGE     APP
ch-busybox   1/1     Running   70         2d22h   busybox


$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
wget: can't connect to remote host (10.108.137.209): Connection timed out
/ #

As before, only Pods with the app=nginx label are allowed ingress to the nginx Pods, so the relabelled busybox Pod is blocked again. Let’s add another entry to the Network Policy that also allows ingress from Pods labelled app=busybox.

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: nginx-allow
spec:
  podSelector:
    matchLabels:
      app: nginx
  ingress:
  - from:
      - podSelector:
          matchLabels:
            app: nginx
      - podSelector:
          matchLabels:
            app: busybox

Let’s apply the changes and repeat our test.

$ kubectl apply -f antrea-network-policy.yaml
networkpolicy.networking.k8s.io/nginx-allow configured

$ kubectl exec -it ch-busybox -- sh
/ # wget -q -O - 10.108.137.209 | grep title
<title>Welcome to nginx!</title>
/ #

And the ingress for the Pod with a busybox label is now working. Nice! One final item to show is that the network policy can also be queried via antctl on the Antrea agent container in the Antrea agent Pod.

$ kubectl exec -it antrea-agent-ktn22 -n kube-system -c antrea-agent -- antctl get networkpolicy
NAMESPACE NAME        APPLIED-TO                           RULES
default   nginx-allow 766a9e51-f132-5c2f-b862-9ac68e75d77d 1

Hopefully that has given you a decent idea of how simple but powerful the network policies are with the new Antrea CNI in vSphere with Tanzu TKG guest clusters. I’ve only just started to experiment, and I’m aware that there are a number of additional innovations coming down the line. No doubt, I’ll be revisiting Antrea to share more goodness with you about that at some point.

Last, I’d like to extend a thanks to Antonin Bas from our Antrea team for providing much of the insight shared here. Merci beaucoup Antonin.