If you’ve been following my recent blog posts, you’ll have seen that I have been spending some time ramping up on NSX-T and Pivotal Container Services (PKS). My long-term goal was to see how these two products integrate and to figure out the various moving parts. As I was very unfamiliar with both products, I took a piecemeal approach to both. First, I tried to get some familiarity with NSX-T. You can find my previous posts on NSX-T here:
- Building a simple ESXi host overlay network with NSX-T
- First steps with NSX-T Edge – DHCP Server
- Next steps with NSX-T – Routing and BGP
During this time, I also tried to familiarize myself with PKS by initially deploying it out on a simple flat network, and deploying my first Kubernetes cluster. You can read about how I did that here:
So now it is time to see if I can get them both working together, by deploying PKS and a Kubernetes cluster, and have NSX-T provide the necessary networking pieces.
I’m not going to go through everything from scratch. Based on my previous configurations, what I am describing below are the additional steps needed when you wish to integrate PKS with NSX-T. If everything works out well, my PKS + NSX-T deployment should look something like the following:
As we go through the configuration steps, I will explain the pieces that need to be pre-configured, and then you will see the components that are automatically instantiated in NSX-T by the PKS integration. Suffice to say that the NSX Controllers, Manager and Edge need to be deployed in advance, as well as the BOSH/PKS CLI and the Pivotal Ops Manager. The Ops Manager will then be used to deploy the BOSH and PKS VMs.
1. NSX-T Additional Requirements
If you need a starting point, use my previous NSX-T posts to get you going. The following is the list of additional items that you will need to configure in NSX-T to use it with PKS.
1.1. First, you will need an IP address management (IPAM) IP Block that will be used by the Kubernetes namespaces on demand. In the NSX-T Manager, navigate to IPAM, then do a +ADD to create your IP Block. I used 172.16.0.0/16 to provide plenty of /24 address ranges for my Kubernetes PODs. Here is what my IP Block looks like.
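To put a number on "plenty", here is a small sketch using Python's standard ipaddress module. It assumes, as the text above suggests, that each namespace is handed one /24 out of the 172.16.0.0/16 block.

```python
import ipaddress

# The IP Block from step 1.1; one /24 per namespace is an assumption
# taken from the description above, not something read back from NSX-T.
ip_block = ipaddress.ip_network("172.16.0.0/16")
namespace_subnets = list(ip_block.subnets(new_prefix=24))

print(len(namespace_subnets))   # 256
print(namespace_subnets[0])     # 172.16.0.0/24
print(namespace_subnets[-1])    # 172.16.255.0/24
```

So a /16 block leaves room for 256 namespace-sized ranges before it is exhausted.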
1.2. Now we need an IP Pool for Load Balancers. These LBs are created by NSX-T and provide access to the K8s namespaces. There will be one created for every K8s namespace. This is done automatically as well. Navigate to Inventory > Groups > IP Pools and add yours. These IP addresses need to have a route out to speak to the rest of the infrastructure. In my case, I have used a range of IPs that is on the same network as the range of IPs that will be used for the Kubernetes Cluster in part 2 (the Service Network). Be careful to keep unique ranges for both the IP Pool and Service Network, and don’t have any overlap of IPs if you choose this approach.
1.3. Next, change the T1 Logical Router Route Advertisement. Previously, I was only advertising NSX connected routes. Now I need to add All NAT Routes and All LB VIP Routes. To modify this configuration, select your T1 Logical Router, then Routing, then Route Advertisement. This is what my configuration now looks like:
That completes all of the settings needed in NSX-T. I don’t have to do anything else with T0 Static Routes or Route Redistribution, or anything else. Let’s now see what changes we need to make in BOSH and PKS.
2. BOSH and PKS additional requirements
2.1 Again, I’m not going to describe everything here. Check out my previous PKS post to get you started. I am going to start with the necessary BOSH deployment changes in the Pivotal Ops Manager to integrate it with NSX-T. When I did my original PKS deployment on a flat network, I put the Management and Service networks on the same VLAN/flat network. Now I want my service network to use NSX-T, so I will need to change the network configuration in BOSH so that my second network is now using an NSX-T network. I will basically use the 191 network described in my Routing and BGP post. This network is essentially an NSX-T logical switch which is routed externally, and whose associated Logical Switch is visible as a port group in vSphere. After making the changes, this is what my network configuration looks like from BOSH.
One thing to note is the reserved IP range. As you just read in part 1, I also used this network for the Load Balancing IP pool in NSX-T. Make sure you do not overlap these ranges if you are using the same segment for both purposes. Reserved here means that BOSH won’t use them.
2.2 Now let’s turn our attention to the PKS configuration. First, select Assign AZs and Networks, and tell PKS to use the Logical Switch “191” network as its service network, i.e. the network on which the K8s master and workers are to be deployed.
2.3. We also need to change the Networking configuration. When you are populating this form, you will need to copy and paste a number of IDs from NSX-T, namely the T0 Logical Router ID, the ID of the load-balancer IP Pool and the ID of the IPAM IP Block. We shall see shortly, when we deploy our first K8s cluster, how these all get tagged within NSX-T by PKS. I’ve also chosen to disable SSL certificate verification.
2.4 The final step in PKS is to turn on the NSX-T Validation errand under Errands on PKS. By default, this is off. If you do not turn this on, you will not see the necessary NSX-T components such as load-balancers and virtual servers, T1 logical routers or tags being created. So make sure you turn this on, as there is no check to make sure you did it, and K8s cluster deployments will simply fail.
That completes all the necessary steps in Pivotal Ops Manager. We are now ready to deploy our K8s cluster with NSX-T networking. Log in to your PKS CLI VM, run the necessary “uaa” and “pks” commands, and create your first K8s cluster. Again, refer back to my previous posts on how to set up and use the correct CLI commands if you haven’t already done so. Here I am using the “bosh task” command to track the deployment. The first time around, there is a lot of activity as many of the required NSX-T components need to be compiled. So you will see VMs getting created to take care of this task, and then deleted before the K8s master and worker VMs are created. Finally, I add a local /etc/hosts entry for the external K8s cluster hostname to match the master IP. This will allow us to run “kubectl” commands later. Alternatively, you could have added this to your DNS.
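If you do share one segment between the load balancer IP Pool and the Service Network, a quick sanity check along these lines can catch an overlap before anything is deployed. This is only a sketch: both ranges below are hypothetical values on the 191 network, so substitute the ranges you actually configured in NSX-T and BOSH.

```python
import ipaddress

def ip_range(first, last):
    """Return the inclusive range first..last as a pair of integers."""
    lo = int(ipaddress.ip_address(first))
    hi = int(ipaddress.ip_address(last))
    if lo > hi:
        raise ValueError(f"range start {first} is after range end {last}")
    return lo, hi

def ranges_overlap(a, b):
    """True when two inclusive (lo, hi) ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical ranges, for illustration only.
lb_pool = ip_range("192.168.191.100", "192.168.191.199")        # NSX-T LB IP Pool
service_range = ip_range("192.168.191.200", "192.168.191.250")  # addresses BOSH may hand out

print(ranges_overlap(lb_pool, service_range))  # False
```

If the check prints True, either shrink the IP Pool or extend the reserved range in BOSH so that the load balancer addresses are excluded.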
root@pks-cli:~# pks create-cluster k8s-cluster-01 --external-hostname pks-cluster-01 --plan small --num-nodes 3

Name: k8s-cluster-01
Plan Name: small
UUID: 2ff760bc-bcf7-4ed6-8fc9-7fcadc4797c9
Last Action: CREATE
Last Action State: in progress
Last Action Description: Creating cluster
Kubernetes Master Host: pks-cluster-01
Kubernetes Master Port: 8443
Worker Instances: 3
Kubernetes Master IP(s): In Progress

root@pks-cli:~# bosh task
Using environment '220.127.116.11' as client 'ops_manager'

Task 22

Task 22 | 12:42:36 | Preparing deployment: Preparing deployment (00:00:05)
Task 22 | 12:42:52 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 22 | 12:42:52 | Compiling packages: nsx-ncp/2e97940ecfef6248e47df6d33f33401201951c39
Task 22 | 12:42:52 | Compiling packages: nsx-cni/02e4a54d92110d142484280ea0155aa6a62d66c6
Task 22 | 12:42:52 | Compiling packages: python_nsx/1dc4a2a093da236d60bc50c4dc00c2465c91f40c
Task 22 | 12:43:41 | Compiling packages: nsx-cni/02e4a54d92110d142484280ea0155aa6a62d66c6 (00:00:49)
Task 22 | 12:44:21 | Compiling packages: nsx-ncp/2e97940ecfef6248e47df6d33f33401201951c39 (00:01:29)
Task 22 | 12:46:03 | Compiling packages: python_nsx/1dc4a2a093da236d60bc50c4dc00c2465c91f40c (00:03:11)
Task 22 | 12:46:03 | Compiling packages: openvswitch/7fdc416abb4b5c40051d365181b4f187ee5c6c6b (00:03:30)
Task 22 | 12:49:56 | Creating missing vms: worker/b4b10717-3ef6-4d37-b30f-de41740e8e0d (0)
Task 22 | 12:49:56 | Creating missing vms: master/7558c290-a7e9-484a-970b-19fd31b68f04 (0)
Task 22 | 12:49:56 | Creating missing vms: worker/3c2de0bc-4714-4027-8351-4b3a0a87f59a (2)
Task 22 | 12:49:56 | Creating missing vms: worker/92464c03-74f6-43c2-9170-9c86675c7ccb (1) (00:00:59)
Task 22 | 12:50:56 | Creating missing vms: master/7558c290-a7e9-484a-970b-19fd31b68f04 (0) (00:01:00)
Task 22 | 12:51:00 | Creating missing vms: worker/3c2de0bc-4714-4027-8351-4b3a0a87f59a (2) (00:01:04)
Task 22 | 12:51:07 | Creating missing vms: worker/b4b10717-3ef6-4d37-b30f-de41740e8e0d (0) (00:01:11)
Task 22 | 12:51:07 | Updating instance master: master/7558c290-a7e9-484a-970b-19fd31b68f04 (0) (canary) (00:01:14)
Task 22 | 12:52:21 | Updating instance worker: worker/b4b10717-3ef6-4d37-b30f-de41740e8e0d (0) (canary) (00:02:09)
Task 22 | 12:54:30 | Updating instance worker: worker/3c2de0bc-4714-4027-8351-4b3a0a87f59a (2) (00:01:57)
Task 22 | 12:56:27 | Updating instance worker: worker/92464c03-74f6-43c2-9170-9c86675c7ccb (1) (00:01:54)

Task 22 Started Mon May 14 12:42:36 UTC 2018
Task 22 Finished Mon May 14 12:58:21 UTC 2018
Task 22 Duration 00:15:45
Task 22 done

Succeeded

root@pks-cli:~# pks cluster k8s-cluster-01

Name: k8s-cluster-01
Plan Name: small
UUID: 2ff760bc-bcf7-4ed6-8fc9-7fcadc4797c9
Last Action: CREATE
Last Action State: in progress
Last Action Description: Instance provisioning in progress
Kubernetes Master Host: pks-cluster-01
Kubernetes Master Port: 8443
Worker Instances: 3
Kubernetes Master IP(s): In Progress

root@pks-cli:~# pks cluster k8s-cluster-01

Name: k8s-cluster-01
Plan Name: small
UUID: 2ff760bc-bcf7-4ed6-8fc9-7fcadc4797c9
Last Action: CREATE
Last Action State: succeeded
Last Action Description: Instance provisioning completed
Kubernetes Master Host: pks-cluster-01
Kubernetes Master Port: 8443
Worker Instances: 3
Kubernetes Master IP(s): 192.168.191.201

root@pks-cli:~# vi /etc/hosts
root@pks-cli:~# grep 192.168.191.201 /etc/hosts
192.168.191.201 pks-cluster-01

root@pks-cli:~# pks get-credentials k8s-cluster-01

Fetching credentials for cluster k8s-cluster-01.
Context set for cluster k8s-cluster-01.

You can now switch between clusters by using:
$kubectl config use-context <cluster-name>

root@pks-cli:~# kubectl config use-context k8s-cluster-01
Switched to context "k8s-cluster-01".

root@pks-cli:~# kubectl get nodes
NAME                                   STATUS    ROLES     AGE       VERSION
6d8f1eaf-90cd-4569-aeff-f0e522c177a5   Ready     <none>    10m       v1.9.6
a33c52be-23dd-4bc9-bcf0-119a82cef84b   Ready     <none>    8m        v1.9.6
ae69009f-9a40-42dc-9815-2fe453373aee   Ready     <none>    12m       v1.9.6
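In the session above I edited /etc/hosts by hand once the master IP appeared. If you would rather script that step, the following is a rough sketch that turns the text printed by "pks cluster" into a hosts entry. It assumes the "Key: value" layout shown in the transcript; the function names are my own and are not part of the PKS CLI.

```python
def parse_cluster_details(text):
    """Turn 'Key: value' lines from `pks cluster` output into a dict."""
    details = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            details[key.strip()] = value.strip()
    return details

def hosts_entry(details):
    """Return an /etc/hosts line, or None while the cluster is still deploying."""
    if details.get("Last Action State") != "succeeded":
        return None
    # Take the first address in case several master IPs are listed.
    master_ip = details["Kubernetes Master IP(s)"].split(",")[0].strip()
    return f"{master_ip} {details['Kubernetes Master Host']}"

# Sample text copied from the transcript above.
sample = """\
Name: k8s-cluster-01
Last Action State: succeeded
Kubernetes Master Host: pks-cluster-01
Kubernetes Master Port: 8443
Kubernetes Master IP(s): 192.168.191.201
"""

print(hosts_entry(parse_cluster_details(sample)))  # 192.168.191.201 pks-cluster-01
```

While the Last Action State is still "in progress", hosts_entry returns None, so a wrapper script can simply retry until a real address shows up.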
Success! Our K8s cluster has deployed. Now let’s take a look at what was built inside NSX-T to accommodate this.
3. NSX-T Tagged Components
When we filled in the NSX-T information in the PKS network section in part 2.3 above, we added IDs for the T0 Logical Router, the IP Block and the load balancer IP pool. These all get tagged now by PKS in NSX-T. Let’s look at the tags first of all.
3.1 The T0 Logical Router has a single tag associated with it. The tag is ncp/shared_resource. NCP is the NSX-T Container Plug-in for Kubernetes.
3.2 The next item that has a tag is the IPAM IP Block. The tag is the same as the T0 Logical Router, ncp/shared_resource.
3.3 The final component that is tagged is the load-balancer IP pool. This has two tags, ncp/shared_resource and ncp/external.
What is important to remember is that if you decide to reinstall PKS, or change the IP Block and the LB pool, you will need to manually remove these tags from the existing components in NSX-T, or things may get very confused as there may be multiple components tagged and it will not know which one to use.
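The situation to avoid is two objects of the same kind carrying the same ncp tag. As a minimal sketch of how you might spot that, the function below scans a list of objects and reports any tag that appears on more than one object of a given type. The inventory here is made up for illustration, and the dictionary shape (resource_type, display_name, and tags with a scope and a tag value) is my assumption about how the objects look when exported from NSX-T, so check it against your own environment.

```python
from collections import defaultdict

NCP_SCOPES = ("ncp/shared_resource", "ncp/external")

def find_ambiguous_tags(objects, scopes=NCP_SCOPES):
    """Report (resource type, tag scope) pairs carried by more than one object."""
    tagged = defaultdict(list)
    for obj in objects:
        for tag in obj.get("tags", []):
            if tag.get("scope") in scopes:
                tagged[(obj["resource_type"], tag["scope"])].append(obj["display_name"])
    return {key: names for key, names in tagged.items() if len(names) > 1}

shared = {"scope": "ncp/shared_resource", "tag": "true"}
external = {"scope": "ncp/external", "tag": "true"}

# Hypothetical inventory: an old LB pool was left tagged after a reinstall.
inventory = [
    {"resource_type": "LogicalRouter", "display_name": "T0-LR", "tags": [shared]},
    {"resource_type": "IpBlock", "display_name": "pks-ip-block", "tags": [shared]},
    {"resource_type": "IpPool", "display_name": "old-lb-pool", "tags": [shared, external]},
    {"resource_type": "IpPool", "display_name": "new-lb-pool", "tags": [shared, external]},
]

for (resource_type, scope), names in find_ambiguous_tags(inventory).items():
    print(f"{resource_type} tagged {scope} more than once: {', '.join(names)}")
```

Anything this reports is a candidate for manual tag removal in the NSX-T Manager before redeploying.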
4. NSX-T Automatically Instantiated Components
Now we will take a look at the set of components that are automatically instantiated when PKS is integrated with NSX-T. We’ve already mentioned a number of these, so let’s take a closer look.
4.1 First of all, there is an NSX-T Load Balancer and associated Virtual Servers. You will find this under Load Balancing > Load Balancers in the NSX-T Manager:
And this Load Balancer is backed by two Virtual Servers, one for http (port 80) and the other for https (port 443), which can be seen when you select the Virtual Servers link. This is what mine look like. Note that this also needs an IP address from the load balancer IP Pool created in step 1.2.
4.2 The next thing we observe is a set of logical switches created for each of the Kubernetes namespaces. We see one for a load balancer, and the other 4 are for the 4 K8s namespaces (default, kube-public, kube-system and pks-infrastructure).
Just FYI, to compare this to the list of all the namespaces, you can use the following kubectl commands. Note some namespaces (default, kube-public) don’t have any PODs.
root@pks-cli:~# kubectl get ns
NAME                 STATUS    AGE
default              Active    20h
kube-public          Active    20h
kube-system          Active    20h
pks-infrastructure   Active    20h

root@pks-cli:~# kubectl get pods --all-namespaces
NAMESPACE            NAME                                    READY     STATUS    RESTARTS   AGE
kube-system          heapster-586c6bcbff-dq4q7               1/1       Running   0          3h
kube-system          kube-dns-5c996f55c8-l9fkw               3/3       Running   0          3h
kube-system          kubernetes-dashboard-55d97799b5-8fmd4   1/1       Running   0          3h
kube-system          monitoring-influxdb-744b677649-kkgmq    1/1       Running   0          3h
pks-infrastructure   nsx-ncp-79bbd9fc44-v99fp                1/1       Running   0          3h
pks-infrastructure   nsx-node-agent-8frbb                    2/2       Running   0          3h
pks-infrastructure   nsx-node-agent-96p28                    2/2       Running   1          3h
pks-infrastructure   nsx-node-agent-hsfdk                    2/2       Running   0          3h
root@pks-cli:~#
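If you want to confirm which namespaces are empty without eyeballing the two listings, a rough sketch like this does the comparison. It assumes the default table output of kubectl, where the first line is a header and the first column is the namespace; the sample text is trimmed from the output above.

```python
def first_column(table_text):
    """Return the first whitespace-separated field of every row after the header."""
    rows = [line for line in table_text.splitlines()[1:] if line.strip()]
    return [row.split()[0] for row in rows]

def namespaces_without_pods(ns_output, pods_output):
    """List namespaces that do not appear in the pod listing, keeping their order."""
    with_pods = set(first_column(pods_output))
    return [ns for ns in first_column(ns_output) if ns not in with_pods]

ns_output = """\
NAME                 STATUS    AGE
default              Active    20h
kube-public          Active    20h
kube-system          Active    20h
pks-infrastructure   Active    20h
"""

pods_output = """\
NAMESPACE            NAME                        READY     STATUS    RESTARTS   AGE
kube-system          heapster-586c6bcbff-dq4q7   1/1       Running   0          3h
pks-infrastructure   nsx-ncp-79bbd9fc44-v99fp    1/1       Running   0          3h
"""

print(namespaces_without_pods(ns_output, pods_output))  # ['default', 'kube-public']
```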
4.3 All of the logical switches are connected to a set of T1 Logical Routers.
4.4 And of course, for these to reach the outside, they are linked to the T0 Logical Router via a set of router ports.
4.5 Last but not least, remember that the PODs have been assigned various addresses from the IPAM IP Block, and these are in the 172.16.0.0/16 range. These are not given direct access to the outside, but instead are SNAT’ed to our Load Balancer IP Pool. This is implemented on the T0 Logical Router. If we look at the NAT Rules on the T0 Logical Router, we see that this is also taken care of:
There we can see the different namespace/POD ranges each having a SNAT rule to map their internal IPs to the external load balancer assigned range. And this is all done automatically for you. Pretty neat!
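To make the pattern in those NAT rules concrete, here is a toy model of it: each namespace gets a /24 from the IP Block, and each /24 is translated to one external address. This is purely illustrative. The allocation order and the external addresses below are hypothetical, and the real assignments are made by NCP, not by anything resembling this function.

```python
import ipaddress

def plan_snat(namespaces, ip_block, external_ips, prefix=24):
    """Pair each namespace with a subnet of ip_block and an external SNAT address."""
    if len(external_ips) < len(namespaces):
        raise ValueError("not enough external addresses for the namespaces given")
    subnets = ipaddress.ip_network(ip_block).subnets(new_prefix=prefix)
    return [
        {"namespace": ns, "source": str(subnet), "translated": ext}
        for ns, subnet, ext in zip(namespaces, subnets, external_ips)
    ]

namespaces = ["default", "kube-public", "kube-system", "pks-infrastructure"]
# Hypothetical addresses standing in for the load balancer IP Pool.
external_ips = ["192.168.191.101", "192.168.191.102", "192.168.191.103", "192.168.191.104"]

rules = plan_snat(namespaces, "172.16.0.0/16", external_ips)
for rule in rules:
    print(f"{rule['namespace']}: SNAT {rule['source']} -> {rule['translated']}")
```

The point is simply that POD addresses never leave the T0 Logical Router untranslated; only the pool addresses need to be routable from the rest of the infrastructure.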
Hopefully this has helped show you the power of integrating NSX-T with PKS. While there is a lot of initial setup to get right, rolling out multiple Kubernetes clusters with unique networking is greatly simplified by NSX-T.
Again this wouldn’t have been possible without guidance from a number of folks. Kudos again to Keith Lee of DELL-EMC (follow Keith on twitter for some upcoming blogs on this), and also Francis Guillier (Technical Product Manager) and Gaetano Borgione (PKS Architect) from our VMware Cloud Native Apps team. I’d also recommend checking out William Lam’s blog series on this, as well as Sam McGeown’s NSX-T 2.0 deployment series of blogs, both of which I relied on heavily. Thanks all!