Configuring Tanzu Kubernetes with a Proxy (Squid)
In this post, I am going to show how I set up my Tanzu Kubernetes Grid management cluster using a proxy configuration. I suspect this is something many readers might want to try at some point, for various reasons. I will add a caveat to say that I have done the bare minimum to get this configuration to work, so you will probably want to spend far more time than I did on tweaking and tuning the proxy configuration. At the end of the day, the purpose of this exercise is to show how a TKG bootstrap virtual machine (running Ubuntu) can access the internet via the proxy to get OS updates, install docker, pull down docker images, get tanzu plugins and finally build a TKG management cluster. This involves building two VMs, one to act as the proxy server and the other to act as the bootstrap environment where I can begin to build TKG clusters. Let's look at the proxy server first.
Step 1. Set up the Proxy Server (Squid)
I created a dual NIC VM, with one connection to my internal VLAN and the other with external connectivity. I installed Ubuntu 20.04 and followed the steps outlined in the Ubuntu docs for Proxy Servers – Squid. Once the proxy server was running, I wanted to give external access to all IP addresses on my internal VLAN in the 10.35.13.0/24 range, which is where my TKG cluster VMs will be deployed. The following is the /etc/squid/squid.conf file I created. There are a lot of comments in the configuration file, so I used this useful grep command to display only non-commented lines. The three main changes from the default configuration are the vlan_3513 ACL for my internal network, the http_access allow rule for that ACL, and the http_port directive binding Squid to my internal address on port 3128.
$ grep -vE '^$|^#' /etc/squid/squid.conf
acl localnet src 0.0.0.1-0.255.255.255  # RFC 1122 "this" network (LAN)
acl localnet src 10.0.0.0/8             # RFC 1918 local private network (LAN)
acl localnet src 100.64.0.0/10          # RFC 6598 shared address space (CGN)
acl localnet src 169.254.0.0/16         # RFC 3927 link-local (directly plugged) machines
acl localnet src 172.16.0.0/12          # RFC 1918 local private network (LAN)
acl localnet src 192.168.0.0/16         # RFC 1918 local private network (LAN)
acl localnet src fc00::/7               # RFC 4193 local private network range
acl localnet src fe80::/10              # RFC 4291 link-local (directly plugged) machines
acl vlan_3513 src 10.35.13.0/24         # Cormac's internal network
acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT
http_access allow vlan_3513
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost manager
include /etc/squid/conf.d/*
http_access allow localnet
http_access allow localhost
http_access deny all
http_port 10.35.13.136:3128
cache_dir ufs /var/spool/squid 100 16 256
coredump_dir /var/spool/squid
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern -i (/cgi-bin/|\?)       0       0%      0
refresh_pattern \/(Packages|Sources)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
refresh_pattern \/Release(|\.gpg)$ 0 0% 0 refresh-ims
refresh_pattern \/InRelease$ 0 0% 0 refresh-ims
refresh_pattern \/(Translation-.*)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
refresh_pattern . 0 20% 4320
via on
Thus, my proxy server is now http://10.35.13.136:3128. Another useful tool is Squid's parse option, which verifies that the entries in the configuration file are correctly formatted:
$ sudo squid -k parse
2021/10/27 15:27:20| Startup: Initializing Authentication Schemes ...
2021/10/27 15:27:20| Startup: Initialized Authentication Scheme 'basic'
2021/10/27 15:27:20| Startup: Initialized Authentication Scheme 'digest'
2021/10/27 15:27:20| Startup: Initialized Authentication Scheme 'negotiate'
2021/10/27 15:27:20| Startup: Initialized Authentication Scheme 'ntlm'
2021/10/27 15:27:20| Startup: Initialized Authentication.
2021/10/27 15:27:20| Processing Configuration File: /etc/squid/squid.conf (depth 0)
2021/10/27 15:27:20| Processing: acl localnet src 0.0.0.1-0.255.255.255 # RFC 1122 "this" network (LAN)
2021/10/27 15:27:20| Processing: acl localnet src 10.0.0.0/8 # RFC 1918 local private network (LAN)
2021/10/27 15:27:20| Processing: acl localnet src 100.64.0.0/10 # RFC 6598 shared address space (CGN)
2021/10/27 15:27:20| Processing: acl localnet src 169.254.0.0/16 # RFC 3927 link-local (directly plugged) machines
2021/10/27 15:27:20| Processing: acl localnet src 172.16.0.0/12 # RFC 1918 local private network (LAN)
2021/10/27 15:27:20| Processing: acl localnet src 192.168.0.0/16 # RFC 1918 local private network (LAN)
2021/10/27 15:27:20| Processing: acl localnet src fc00::/7 # RFC 4193 local private network range
2021/10/27 15:27:20| Processing: acl localnet src fe80::/10 # RFC 4291 link-local (directly plugged) machines
2021/10/27 15:27:20| Processing: acl vlan_3513 src 10.35.13.0/24 # Cormac's internal network
2021/10/27 15:27:20| Processing: acl SSL_ports port 443
2021/10/27 15:27:20| Processing: acl Safe_ports port 80 # http
2021/10/27 15:27:20| Processing: acl Safe_ports port 21 # ftp
2021/10/27 15:27:20| Processing: acl Safe_ports port 443 # https
2021/10/27 15:27:20| Processing: acl Safe_ports port 70 # gopher
2021/10/27 15:27:20| Processing: acl Safe_ports port 210 # wais
2021/10/27 15:27:20| Processing: acl Safe_ports port 1025-65535 # unregistered ports
2021/10/27 15:27:20| Processing: acl Safe_ports port 280 # http-mgmt
2021/10/27 15:27:20| Processing: acl Safe_ports port 488 # gss-http
2021/10/27 15:27:20| Processing: acl Safe_ports port 591 # filemaker
2021/10/27 15:27:20| Processing: acl Safe_ports port 777 # multiling http
2021/10/27 15:27:20| Processing: acl CONNECT method CONNECT
2021/10/27 15:27:20| Processing: http_access allow vlan_3513
2021/10/27 15:27:20| Processing: http_access deny !Safe_ports
2021/10/27 15:27:20| Processing: http_access deny CONNECT !SSL_ports
2021/10/27 15:27:20| Processing: http_access allow localhost manager
2021/10/27 15:27:20| Processing: include /etc/squid/conf.d/*
2021/10/27 15:27:20| Processing Configuration File: /etc/squid/conf.d/debian.conf (depth 1)
2021/10/27 15:27:20| Processing: logfile_rotate 0
2021/10/27 15:27:20| Processing: http_access allow localnet
2021/10/27 15:27:20| Processing: http_access allow localhost
2021/10/27 15:27:20| Processing: http_access deny all
2021/10/27 15:27:20| Processing: http_port 10.35.13.136:3128
2021/10/27 15:27:20| Processing: cache_dir ufs /var/spool/squid 100 16 256
2021/10/27 15:27:20| Processing: coredump_dir /var/spool/squid
2021/10/27 15:27:20| Processing: refresh_pattern ^ftp: 1440 20% 10080
2021/10/27 15:27:20| Processing: refresh_pattern ^gopher: 1440 0% 1440
2021/10/27 15:27:20| Processing: refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
2021/10/27 15:27:20| Processing: refresh_pattern \/(Packages|Sources)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
2021/10/27 15:27:20| Processing: refresh_pattern \/Release(|\.gpg)$ 0 0% 0 refresh-ims
2021/10/27 15:27:20| Processing: refresh_pattern \/InRelease$ 0 0% 0 refresh-ims
2021/10/27 15:27:20| Processing: refresh_pattern \/(Translation-.*)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
2021/10/27 15:27:20| Processing: refresh_pattern . 0 20% 4320
2021/10/27 15:27:20| Processing: via on
2021/10/27 15:27:20| Initializing https:// proxy context
Once the configuration has been processed without errors, we can proceed with building our second VM, which will access the internet via the proxy server. As I said in the introduction, I have done the bare minimum proxy configuration here to get this working, so you may want to spend some more time and research on additional security steps.
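One final note on the proxy server before moving on: if you edit /etc/squid/squid.conf again later, the running service does not pick up the change automatically. A minimal sketch of applying a configuration change, assuming the standard squid systemd unit installed by the Ubuntu package:

$ sudo squid -k parse          # validate the edited configuration first
$ sudo squid -k reconfigure    # tell the running Squid to reload its configuration
# or simply restart the service
$ sudo systemctl restart squid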
Step 2. Set up the proxy client VM / tanzu bootstrap node
I have broken this step into a number of parts, as there are several items to consider.
2.1 Ubuntu and Docker
Before we do anything with TKG and tanzu, we must first set up this virtual machine / guest OS to function via the proxy. Again, I have installed Ubuntu 20.04. The next step is to install docker. I followed the official docker guide for installing docker on Ubuntu using the repository. Note however that this makes extensive use of apt calls, as well as curl, and both of these need to be told how to use the proxy. To enable apt to access the internet via a proxy, simply create the file /etc/apt/apt.conf.d/proxy.conf and add the following lines for both http and https (changing the settings to your proxy server and port of course):
Acquire::http::Proxy "http://10.35.13.136:3128";
Acquire::https::Proxy "http://10.35.13.136:3128";
The next time you run an apt command, it should use this proxy configuration. For curl, you simply need to include a -x (or --proxy) option with the curl command, providing the proxy in the form [protocol]://[proxy-server]:[proxy-port], e.g. http://10.35.13.136:3128. This will now allow you to install docker on the VM.
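For illustration, here is roughly what the GPG key step from the Docker install guide looked like when routed through my proxy. The key URL and keyring path are taken from the Docker documentation at the time of writing, so do check them against the current guide:

$ curl -x http://10.35.13.136:3128 -fsSL https://download.docker.com/linux/ubuntu/gpg | \
    sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg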
The final step to make sure docker is functioning is to tell the docker daemon about the proxy server so that it directs its registry pull requests via the proxy. Again, the official docker documentation shows how to create a proxy configuration. In a nutshell, you must create the file /etc/systemd/system/docker.service.d/http-proxy.conf and add the following entries (again, changing the settings to your proxy server and port of course).
[Service]
Environment="HTTP_PROXY=http://10.35.13.136:3128"
Environment="HTTPS_PROXY=http://10.35.13.136:3128"
Reload and restart docker, then try a simple docker test, such as docker run hello-world. If the image is successfully fetched from the docker registry, you should be good to proceed to the next step.
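For completeness, those steps look something like this. The systemctl show line is an optional sanity check, borrowed from the Docker docs, that the daemon actually sees the proxy variables:

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
$ sudo systemctl show --property=Environment docker
Environment=HTTP_PROXY=http://10.35.13.136:3128 HTTPS_PROXY=http://10.35.13.136:3128
$ docker run hello-world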
2.2 tanzu CLI setup
You should be able to download and install the tanzu CLI as per the official tanzu documentation. However, there is a caveat around the tanzu plugin list command: it attempts to pull a manifest from an external repository, and fails as follows:
$ tanzu plugin list
Error: could not fetch manifest from repository "core": Get "https://storage.googleapis.com/tanzu-cli/artifacts/manifest.yaml": dial tcp 142.250.189.176:443: i/o timeout
✖  could not fetch manifest from repository "core": Get "https://storage.googleapis.com/tanzu-cli/artifacts/manifest.yaml": dial tcp 142.250.189.176:443: i/o timeout
This is failing because the request is not being sent through the proxy server; the CLI is trying to reach the repository directly from the internal network. To my knowledge, there is no way to specify a proxy on the tanzu command line (I may be mistaken here, but I was unable to find one), so to address this you need to set some proxy environment variables in your shell. This can be done in a number of ways. You could add the proxies to the global network configuration of the OS, which will then automatically add the environment variables to your shell, or alternatively set them in your profile. I added them to my ~/.bash_profile as follows:
export HTTP_PROXY=http://10.35.13.136:3128/
export HTTPS_PROXY=http://10.35.13.136:3128/
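If you go the profile route, remember to source the file (or start a new login shell) before retrying. A quick, purely illustrative check that the variables are now set:

$ source ~/.bash_profile
$ env | grep -i _proxy
HTTP_PROXY=http://10.35.13.136:3128/
HTTPS_PROXY=http://10.35.13.136:3128/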
In either case, they appear as environment variables in your shell, and I also believe they work with both upper-case and lower-case names. Now the tanzu plugin list command works as expected:
$ tanzu plugin list
  NAME                LATEST VERSION  DESCRIPTION                                                         REPOSITORY  VERSION  STATUS
  alpha               v1.3.1          Alpha CLI commands                                                  core                 not installed
  cluster             v1.3.1          Kubernetes cluster operations                                       core        v1.3.1   installed
  kubernetes-release  v1.3.1          Kubernetes release operations                                       core        v1.3.1   installed
  login               v1.3.1          Login to the platform                                               core        v1.3.1   installed
  management-cluster  v1.3.1          Kubernetes management cluster operations                            core        v1.3.1   installed
  pinniped-auth       v1.3.1          Pinniped authentication operations (usually not directly invoked)   core        v1.3.1   installed
We can now proceed with the creation of the TKG management cluster.
2.3 TKG management cluster deployment – docker requirements
Before creating a management cluster, add your user to the docker group; otherwise the tanzu management-cluster create command will complain that the docker daemon is not running. Even sudo will not help. This is the error reported:
$ tanzu management-cluster create -u
Validating the pre-requisites...
Error: docker prerequisites validation failed: Docker daemon is not running, Please make sure Docker daemon is up and running
You can use the usermod command to add your user (in this case cormac) to the docker group.
$ sudo usermod -aG docker cormac
[sudo] password for cormac: ******
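Note that the new group membership does not apply to your existing login session. Something like the following, using standard Linux commands and nothing TKG-specific, avoids a full logout/login:

$ newgrp docker    # start a shell with the docker group active
$ id -nG           # confirm 'docker' now appears in your list of groups
$ docker ps        # should no longer complain about permissions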
Now the pre-requisites will pass:
$ tanzu management-cluster create -u
Validating the pre-requisites...
Serving kickstart UI at http://127.0.0.1:8080
2.4 TKG Management Cluster – vCenter server considerations
Now we get to the part where I spun my wheels the most. When I connect to my vCenter server from a browser on my bootstrap VM through the proxy, I get the usual browser warning that the connection is not private because the vCenter certificate is not signed by a trusted authority.
And this is fine since it is my lab; I haven't signed my vCenter certificates, so I can just go ahead and accept the risk. However, this raises another issue for the TKG UI. It also reports that it has found a certificate signed by an unknown authority, and there is no way to tell TKG to accept the risk and continue.
Now, there may be some ways of allowing this to work via the proxy configuration. I thought the ssl-bump feature in Squid might allow this to work, but there seem to be some issues with using this feature on Ubuntu. In the end, I decided that the easiest thing to do would be to create yet another environment variable, NO_PROXY, and add the vCenter server domain to it. NO_PROXY is essentially a list of destinations that bypass the proxy. After adding eng.vmware.com to the NO_PROXY settings, the TKG UI was able to proceed with the connection to vCenter.
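For reference, in my setup this was simply another export alongside the earlier proxy variables in ~/.bash_profile. Substitute your own vCenter domain; I also set the lower-case form, since some tools only honour the lower-case variable:

export NO_PROXY=eng.vmware.com
export no_proxy=eng.vmware.com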
2.5 TKG Management Cluster – Proxy Settings
Later on in the UI, you are prompted for proxy details for the TKG management cluster itself. You can add the vCenter domain here too. Do not place any wildcards in the NO_PROXY field, such as an "*". Even though the TKG UI will accept *.eng.vmware.com, it will fail later when parsing it, as it expects alphanumeric characters only. In my deployment, the Proxy Settings screen simply contained the proxy URL (http://10.35.13.136:3128) for both HTTP and HTTPS, with eng.vmware.com in the NO_PROXY field.
Note that the NO_PROXY field prompts you to add additional entries to the NO_PROXY list, such as the Pod CIDR, Service CIDR and others. This is so that internal TKG / Kubernetes cluster communication does not use the proxy, e.g. for logging. Entries similar to the following should now appear in the TKG management cluster configuration file:
TKG_HTTP_PROXY: http://10.35.13.136:3128
TKG_HTTP_PROXY_ENABLED: "true"
TKG_HTTPS_PROXY: http://10.35.13.136:3128
TKG_NO_PROXY: eng.vmware.com,127.0.0.0/8,::1,svc,svc.cluster.local,100.64.0.0/16,100.96.0.0/16
You should now have everything in place to successfully deploy a TKG management cluster via a proxy.
Step 3. Deploy TKG Management Cluster via Proxy
The TKG management cluster deployment via a proxy now appears much the same as a standard deployment, except that the container images are pulled via the proxy rather than directly from the internet.
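As an aside, once the UI has saved its cluster configuration file, subsequent proxied deployments can be driven straight from the CLI rather than the UI. The file name below is just a placeholder for whatever configuration file the UI generated on your system:

$ tanzu management-cluster create --file ~/.tanzu/tkg/clusterconfigs/mgmt-proxy-config.yaml

Either way, progress can be monitored with tanzu management-cluster get, as shown below: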
$ tanzu management-cluster get
  NAME        NAMESPACE   STATUS    CONTROLPLANE  WORKERS  KUBERNETES        ROLES
  mgmt-proxy  tkg-system  creating  0/1           1/1      v1.20.5+vmware.1  management

Details:

NAME                                                           READY  SEVERITY  REASON                   SINCE  MESSAGE
/mgmt-proxy                                                    False  Info      WaitingForControlPlane   18s
├─ClusterInfrastructure - VSphereCluster/mgmt-proxy            True                                      17s
├─ControlPlane - KubeadmControlPlane/mgmt-proxy-control-plane
│ └─Machine/mgmt-proxy-control-plane-pvnzg                     True                                      12s
└─Workers
  └─MachineDeployment/mgmt-proxy-md-0
    └─Machine/mgmt-proxy-md-0-df8c9b68-b8dfb                   True                                      12s

Providers:

  NAMESPACE                          NAME                    TYPE                    PROVIDERNAME  VERSION  WATCHNAMESPACE
  capi-kubeadm-bootstrap-system      bootstrap-kubeadm       BootstrapProvider       kubeadm       v0.3.14
  capi-kubeadm-control-plane-system  control-plane-kubeadm   ControlPlaneProvider    kubeadm       v0.3.14
  capi-system                        cluster-api             CoreProvider            cluster-api   v0.3.14
  capv-system                        infrastructure-vsphere  InfrastructureProvider  vsphere       v0.7.7

$ tanzu management-cluster get
  NAME        NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
  mgmt-proxy  tkg-system  running  1/1           1/1      v1.20.5+vmware.1  management

Details:

NAME                                                           READY  SEVERITY  REASON  SINCE  MESSAGE
/mgmt-proxy                                                    True                     61s
├─ClusterInfrastructure - VSphereCluster/mgmt-proxy            True                     80s
├─ControlPlane - KubeadmControlPlane/mgmt-proxy-control-plane  True                     62s
│ └─Machine/mgmt-proxy-control-plane-pvnzg                     True                     75s
└─Workers
  └─MachineDeployment/mgmt-proxy-md-0
    └─Machine/mgmt-proxy-md-0-df8c9b68-b8dfb                   True                     75s

Providers:

  NAMESPACE                          NAME                    TYPE                    PROVIDERNAME  VERSION  WATCHNAMESPACE
  capi-kubeadm-bootstrap-system      bootstrap-kubeadm       BootstrapProvider       kubeadm       v0.3.14
  capi-kubeadm-control-plane-system  control-plane-kubeadm   ControlPlaneProvider    kubeadm       v0.3.14
  capi-system                        cluster-api             CoreProvider            cluster-api   v0.3.14
  capv-system                        infrastructure-vsphere  InfrastructureProvider  vsphere       v0.7.7

$ tanzu login
? Select a server mgmt-proxy ()
✔  successfully logged in to management cluster using the kubeconfig mgmt-proxy

$ kubectl config get-contexts
CURRENT   NAME                          CLUSTER      AUTHINFO           NAMESPACE
*         mgmt-proxy-admin@mgmt-proxy   mgmt-proxy   mgmt-proxy-admin

$ kubectl get nodes
NAME                             STATUS   ROLES                  AGE     VERSION
mgmt-proxy-control-plane-pvnzg   Ready    control-plane,master   5m18s   v1.20.5+vmware.1
mgmt-proxy-md-0-df8c9b68-b8dfb   Ready    <none>                 4m25s   v1.20.5+vmware.1

$ kubectl get apps -A
NAMESPACE    NAME                   DESCRIPTION           SINCE-DEPLOY   AGE
tkg-system   antrea                 Reconcile succeeded   2m37s          2m38s
tkg-system   metrics-server         Reconcile succeeded   2m37s          2m38s
tkg-system   tanzu-addons-manager   Reconcile succeeded   2m26s          5m32s
tkg-system   vsphere-cpi            Reconcile succeeded   2m24s          2m38s
tkg-system   vsphere-csi            Reconcile succeeded   2m20s          2m38s
For completeness, the Squid access log on the proxy server (by default /var/log/squid/access.log) shows the image pulls from projects.registry.vmware.com being tunnelled through the proxy during the deployment:

1635334052.674    897 10.35.13.140 TCP_TUNNEL/200 17034 CONNECT projects.registry.vmware.com:443 - HIER_DIRECT/10.188.25.227 -
1635334070.195    883 10.35.13.140 TCP_TUNNEL/200 16013 CONNECT projects.registry.vmware.com:443 - HIER_DIRECT/10.188.25.227 -
1635334073.462    860 10.35.13.140 TCP_TUNNEL/200 15563 CONNECT projects.registry.vmware.com:443 - HIER_DIRECT/10.188.25.227 -
1635334075.522   1045 10.35.13.140 TCP_TUNNEL/200 14602 CONNECT projects.registry.vmware.com:443 - HIER_DIRECT/10.188.25.227 -
1635334090.337  20086 10.35.13.140 TCP_TUNNEL/200 1391477 CONNECT 10.35.13.157:6443 - HIER_DIRECT/10.35.13.157 -
1635334090.452     84 10.35.13.140 TCP_TUNNEL/200 98468 CONNECT 10.35.13.157:6443 - HIER_DIRECT/10.35.13.157 -
1635334109.229   1014 10.35.13.140 TCP_TUNNEL/200 22500 CONNECT projects.registry.vmware.com:443 - HIER_DIRECT/10.188.25.227 -
1635334111.024   1066 10.35.13.140 TCP_TUNNEL/200 17419 CONNECT projects.registry.vmware.com:443 - HIER_DIRECT/10.188.25.227 -