Tanzu Kubernetes Grid multi-cloud (TKGm) from the tkg Command Line Interface
After spending quite a bit of time looking at vSphere with Kubernetes, and how one could deploy a Tanzu Kubernetes Grid (TKG) “guest” cluster in a namespace with a simple manifest file, I thought it was time to look at other ways in which customers could deploy TKG clusters on top of vSphere infrastructure. In other words, deploy TKG without vSphere with Kubernetes, or VMware Cloud Foundation (VCF) for that matter. This post will look at TKG multi-cloud (TKGm) version 1.1.2 and in particular the tkg command line tool to first deploy a TKG management cluster, and once that is stood up, we will see how simple it is to deploy additional workload TKG clusters.
Now, there are a number of excellent posts out there on TKG already. Here is one from William Lam (a sort of tech preview). Here is another from Chip Zoller and another great one from Kendrick Coleman. While the tkg CLI allows you to deploy to both vSphere and AWS, I will focus on vSphere only in this post.
Cluster-API and Kind
Without getting into the weeds too much, this deployment mechanism relies heavily on Cluster-API, “a Kubernetes project to bring declarative, Kubernetes-style APIs to cluster creation, configuration, and management”. A simple Kubernetes cluster is first stood up, and this is then used to deploy a more substantial Kubernetes cluster via Cluster-API, in our case the TKGm Management cluster. The simple Kubernetes cluster is based on Kind. Kind is a tool for running local Kubernetes clusters using Docker container “nodes”. Variations of the following visualization have been used a lot to help understand how it all ties together, so I’ll share it here once more.
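To give a flavour of what “declarative, Kubernetes-style APIs” means in practice, here is a minimal sketch of the sort of Cluster API object that drives cluster creation. The name is hypothetical and the real manifests generated by the tkg CLI contain far more detail; this is purely illustrative:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: example-cluster                 # hypothetical name, for illustration only
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["100.96.0.0/11"]     # Cluster Pod CIDR
    services:
      cidrBlocks: ["100.64.0.0/13"]     # Cluster Service CIDR
  infrastructureRef:                    # provider-specific cluster object (CAPV on vSphere)
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereCluster
    name: example-cluster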
1. Prerequisites
It is envisioned that you should be able to use your laptop or desktop or even a VM to deploy TKG. Whichever device you decide to use, you will need the following components available to successfully deploy the cluster. A quick way to verify these are in place is shown after the list.
- Docker will need to be installed on your device. There are many ways of doing this.
- The tkg CLI needs to be installed. This is available from MyVMware.
- The kubectl binary will need to be available as well. The version is dependent on the version of TKG cluster being deployed. This is checked and reported on at install time.
- 2 x OVA images, one for the Load Balancer/HA Proxy and one for the TKG cluster VMs are required. These are provided by VMware. Simply deploy the OVAs in your vSphere environment where you plan to deploy TKG, and convert them to templates. These are also available from MyVMware. Make sure you use the correct OVA versions that match the version of tkg CLI that you use.
- Keep a copy of your SSH public key nearby (normally in your HOME folder under .ssh/id_rsa.pub). There is no password authentication enabled on the TKG nodes so this is the only way you’ll be able to SSH to the nodes.
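Here is a quick set of checks to confirm the client-side pieces are in place before starting (the exact output will obviously vary with your environment):

$ docker version --format '{{.Server.Version}}'   # Docker daemon is installed and running
$ tkg version                                     # tkg CLI is on the PATH
$ kubectl version --client --short                # kubectl is available
$ cat ~/.ssh/id_rsa.pub                           # SSH public key to paste into the UI later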
2. TKG Management Cluster Deployment
Launch the management cluster deployment using the tkg command line tool. The tkg init command starts the creation of the bootstrap cluster. The -u option launches a UI to allow you to select the infrastructure type, as well as add management cluster details.
$ tkg version
Client:
        Version: v1.1.2
        Git commit: c1db5bed7bc95e2ba32cf683c50525cdff0f2396

$ tkg init -u
Logs of the command execution can also be found at: /tmp/tkg-20200702T123104396946940.log

Validating the pre-requisites...
Serving kickstart UI at http://127.0.0.1:8080
At this point, a .tkg folder is created in your home folder. This contains a bunch of components that will be used by the bootstrap process. Now, there is also a way to deploy TKG using a configuration YAML file, but we will look at how to do it via the UI first. Simply open a browser on your desktop/laptop/VM and point to http://127.0.0.1:8080, as shown above.
Let’s review the various inputs that you need to provide in the UI to successfully stand up a TKG management cluster. The first step is to select the infrastructure type where TKG will be deployed. Currently, the options are vSphere and AWS. We are going to do a vSphere deployment.
Next, add details for vSphere, namely the vCenter server, user name and password. Once populated, click connect to make sure you can reach the environment from your desktop/laptop/VM.
A word of caution – I am deploying TKG to my vSphere 7.0 environment. This is not supported, and the following message clearly states that. At the time of writing, 7.0 is not a supported platform for TKG standalone. If you wish to use TKG clusters with vSphere 7, you must use it in the context of vSphere with Kubernetes:
The last steps in this window are to select the Datacenter from the vSphere inventory where the TKG management cluster is to be provisioned, and add your public key to allow you to SSH to the TKG nodes.
Click Next to get to the Management Cluster Settings. Here you can select the Instance Type. If Development is chosen, a single control plane node and a single worker node is deployed. If Production is chosen, three control plane nodes and a single worker node is deployed. At this point, you also (optionally) add a management cluster name and select the API Server Load Balancer (aka HA Proxy). The HA Proxy is an imported OVA converted to a template which is listed in the prerequisites above. Note that different versions of TKG rely on different versions of OVA, so make sure the OVAs are compatible with the TKG release. Finally, select the worker node and HA proxy resources (or in other words, what size of VM do you wish to deploy?).
The next screen looks for vSphere resources, such as VM Folder and Datastore. Select and click next.
Next, select a VM network for the Kubernetes nodes. This network must have DHCP available, and also must be reachable from your desktop/laptop/VM (where you are deploying the TKG from). This has caught a few people out, including myself. The Cluster Pod and Service CIDRs can be left at the defaults.
Last but not least, pick the OVA template for the Kubernetes node OS image. This is the second OVA that you downloaded, deployed and converted to a template in the prerequisite steps. Click next to complete the wizard.
You can now Review the configuration.
And if everything looks good, we can click on the last step to deploy the TKG management cluster.
And now the UI displays the various steps involved in standing up the TKG Management cluster. As mentioned, first the kind cluster (Kubernetes using Docker) is stood up. This is then used to make declarative, Kubernetes-style API requests to bring up the TKG management cluster on vSphere. First, the kind bootstrap cluster is configured.
At this point, you can run a docker command in the desktop/laptop/VM and see the bootstrap (kind) cluster:
$ docker ps
CONTAINER ID   IMAGE                                                COMMAND                  \
    CREATED          STATUS          PORTS                       NAMES
5192939d9c20   registry.tkg.vmware.run/kind/node:v1.18.3_vmware.1   "/usr/local/bin/entr…"   \
    13 minutes ago   Up 13 minutes   127.0.0.1:46861->6443/tcp   tkg-kind-brush57avu7rf0bd8ed0-control-plane
Step 3 is to install the providers on the bootstrap cluster. Providers can be thought of as the components required to run Cluster-API on the different infrastructures, which in this case is vSphere.
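If you are curious, you can point kubectl at the temporary bootstrap kubeconfig that tkg reports (the file name is generated per run, so yours will differ) and watch the provider pods come up. A hedged example:

$ kubectl get pods -A --kubeconfig ~/.kube-tkg/tmp/config_FOd8QIxS
# Expect controller pods in capi-system, capi-kubeadm-bootstrap-system,
# capi-kubeadm-control-plane-system, capv-system and cert-manager.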
In step 4, the bootstrap cluster is up and running. It now starts building the TKG management cluster.
Step 6 is adding the add-ons to the management cluster. Remember that we will use the TKG management cluster to quickly build out additional workload TKG clusters. Components needed to do that are added here.
Once the TKG management cluster is stood up, we no longer need the bootstrap cluster. This step updates the management cluster with the information in the bootstrap cluster before we delete it.
And at step 8, the TKG management cluster deployment is now complete.
Here is the complete tkg init -u output:
$ tkg init -u
Logs of the command execution can also be found at: /tmp/tkg-20200702T123104396946940.log

Validating the pre-requisites...
Serving kickstart UI at http://127.0.0.1:8080
Validating configuration...
web socket connection established
sending pending 2 logs to UI
Using infrastructure provider vsphere:v0.6.5
Generating cluster configuration...
Setting up bootstrapper...
Bootstrapper created. Kubeconfig: /home/cormac/.kube-tkg/tmp/config_FOd8QIxS
Installing providers on bootstrapper...
Fetching providers
Installing cert-manager
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.6" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.6" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.6" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-vsphere" Version="v0.6.5" TargetNamespace="capv-system"
Start creating management cluster...
Saving management cluster kuebconfig into /home/cormac/.kube/config
Installing providers on management cluster...
Fetching providers
Installing cert-manager
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.6" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.6" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.6" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-vsphere" Version="v0.6.5" TargetNamespace="capv-system"
Waiting for the management cluster to get ready for move...
Moving all Cluster API objects from bootstrap cluster to management cluster...
Performing move...
Discovering Cluster API objects
Moving Cluster API objects Clusters=1
Creating objects in the target cluster
Deleting objects from the source cluster
Context set for management cluster tkg-mgmt-vsphere-20200702124420 as 'tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420'.

Management cluster created!

You can now create your first workload cluster by running the following:

  tkg create cluster [name] --kubernetes-version=[version] --plan=[plan]
What is interesting to review at this point is the .tkg/config.yaml file that was created as part of this process. Here is the original vanilla config.yaml that was created when the tkg init command was run.
$ cat .tkg/config.yaml
cert-manager-timeout: 30m0s
overridesFolder: /home/cormac/.tkg/overrides
NODE_STARTUP_TIMEOUT: 20m
BASTION_HOST_ENABLED: "true"
providers:
- name: cluster-api
  url: /home/cormac/.tkg/providers/cluster-api/v0.3.6/core-components.yaml
  type: CoreProvider
- name: aws
  url: /home/cormac/.tkg/providers/infrastructure-aws/v0.5.4/infrastructure-components.yaml
  type: InfrastructureProvider
- name: vsphere
  url: /home/cormac/.tkg/providers/infrastructure-vsphere/v0.6.5/infrastructure-components.yaml
  type: InfrastructureProvider
- name: tkg-service-vsphere
  url: /home/cormac/.tkg/providers/infrastructure-tkg-service-vsphere/v1.0.0/unused.yaml
  type: InfrastructureProvider
- name: kubeadm
  url: /home/cormac/.tkg/providers/bootstrap-kubeadm/v0.3.6/bootstrap-components.yaml
  type: BootstrapProvider
- name: kubeadm
  url: /home/cormac/.tkg/providers/control-plane-kubeadm/v0.3.6/control-plane-components.yaml
  type: ControlPlaneProvider
images:
  all:
    repository: registry.tkg.vmware.run/cluster-api
  cert-manager:
    repository: registry.tkg.vmware.run/cert-manager
    tag: v0.11.0_vmware.1
release:
  version: v1.1.2
Now, after the TKG management cluster has been set up, this is what we see updated in the config.yaml:
$ cat .tkg/config.yaml
cert-manager-timeout: 30m0s
overridesFolder: /home/cormac/.tkg/overrides
NODE_STARTUP_TIMEOUT: 20m
BASTION_HOST_ENABLED: "true"
providers:
- name: cluster-api
  url: /home/cormac/.tkg/providers/cluster-api/v0.3.6/core-components.yaml
  type: CoreProvider
- name: aws
  url: /home/cormac/.tkg/providers/infrastructure-aws/v0.5.4/infrastructure-components.yaml
  type: InfrastructureProvider
- name: vsphere
  url: /home/cormac/.tkg/providers/infrastructure-vsphere/v0.6.5/infrastructure-components.yaml
  type: InfrastructureProvider
- name: tkg-service-vsphere
  url: /home/cormac/.tkg/providers/infrastructure-tkg-service-vsphere/v1.0.0/unused.yaml
  type: InfrastructureProvider
- name: kubeadm
  url: /home/cormac/.tkg/providers/bootstrap-kubeadm/v0.3.6/bootstrap-components.yaml
  type: BootstrapProvider
- name: kubeadm
  url: /home/cormac/.tkg/providers/control-plane-kubeadm/v0.3.6/control-plane-components.yaml
  type: ControlPlaneProvider
images:
  all:
    repository: registry.tkg.vmware.run/cluster-api
  cert-manager:
    repository: registry.tkg.vmware.run/cert-manager
    tag: v0.11.0_vmware.1
release:
  version: v1.1.2
VSPHERE_FOLDER: /Datacenter/vm/Discovered virtual machine
VSPHERE_WORKER_MEM_MIB: "4096"
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa xxxxxxxxxx cormac@pks-cli.rainpole.com
CLUSTER_CIDR: 100.96.0.0/11
VSPHERE_HAPROXY_TEMPLATE: /Datacenter/vm/Templates/photon-3-haproxy-v1.2.4+vmware.1
VSPHERE_NETWORK: VM Network
VSPHERE_RESOURCE_POOL: /Datacenter/host/OCTO-Cluster/Resources
VSPHERE_PASSWORD: <encoded:Vk13YXJlMTIzIQ==>
VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
VSPHERE_DATASTORE: /Datacenter/datastore/vsanDatastore
VSPHERE_CONTROL_PLANE_DISK_GIB: "40"
VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
VSPHERE_HA_PROXY_MEM_MIB: "4096"
SERVICE_CIDR: 100.64.0.0/13
VSPHERE_SERVER: 10.27.51.106
VSPHERE_DATACENTER: /Datacenter
VSPHERE_WORKER_NUM_CPUS: "2"
VSPHERE_HA_PROXY_DISK_GIB: "40"
VSPHERE_HA_PROXY_NUM_CPUS: "2"
VSPHERE_USERNAME: administrator@vsphere.local
VSPHERE_WORKER_DISK_GIB: "40"
tkg:
  regions:
  - name: tkg-mgmt-vsphere-20200702124420
    context: tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420
    file: /home/cormac/.kube-tkg/config
    isCurrentContext: false
  current-region-context: tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420
This config.yaml can now be used to deploy future TKG management clusters directly from the command line, rather than from the UI.
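For example, with the vSphere settings above already present in .tkg/config.yaml, a non-UI deployment might look something like the following (the management cluster name is illustrative; check tkg init --help for the exact flags supported by your tkg version):

$ tkg init --infrastructure vsphere --plan dev --name tkg-mgmt-vsphere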
3. Query TKG Management Cluster Deployment
At this point, the TKG management cluster has now been deployed to my vSphere environment. We can query it as follows:
$ tkg get management-cluster
 MANAGEMENT-CLUSTER-NAME            CONTEXT-NAME
 tkg-mgmt-vsphere-20200702124420 *  tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420
I requested a “dev” deployment which provides a single control plane VM and a single worker VM. If we switch contexts to the management cluster, we can verify that.
$ kubectl config get-contexts
CURRENT   NAME                                                                     CLUSTER                           AUTHINFO                                NAMESPACE
          tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420   tkg-mgmt-vsphere-20200702124420   tkg-mgmt-vsphere-20200702124420-admin

$ kubectl config use-context tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420
Switched to context "tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420".

$ kubectl config get-contexts
CURRENT   NAME                                                                     CLUSTER                           AUTHINFO                                NAMESPACE
*         tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420   tkg-mgmt-vsphere-20200702124420   tkg-mgmt-vsphere-20200702124420-admin

$ kubectl get nodes
NAME                                                    STATUS   ROLES    AGE    VERSION
tkg-mgmt-vsphere-20200702124420-control-plane-ftrdz    Ready    master   111m   v1.18.3+vmware.1
tkg-mgmt-vsphere-20200702124420-md-0-cfd484f8b-xhcjc   Ready    <none>   108m   v1.18.3+vmware.1
Obviously the management cluster is also visible in the vSphere inventory:
The TKG management cluster has been provisioned as requested. Let’s now turn our attention to deploying a TKG workload cluster.
4. TKG Workload Cluster Deployment
All of this can be done from the tkg command line. Once again, I am going to deploy a simple cluster with one control plane VM and one worker VM. To do that, I specify a plan called “dev” in the tkg command (the alternative plan is “prod”). After creating the cluster, I use another tkg command to retrieve the credentials. This automatically populates my KUBECONFIG. After changing contexts to the workload cluster, I can query the nodes, and once again see that there is a single master and a single worker.
$ tkg create cluster cormac-workload --plan dev
Logs of the command execution can also be found at: /tmp/tkg-20200701T171627162185280.log
Validating configuration...
Creating workload cluster 'cormac-workload'...
Waiting for cluster to be initialized...
Waiting for cluster nodes to be available...

Workload cluster 'cormac-workload' created

$ tkg get cluster
 NAME             NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES
 cormac-workload  default    running  1/1           1/1      v1.18.3+vmware.1

$ tkg get credentials cormac-workload
Credentials of workload cluster 'cormac-workload' have been saved
You can now access the cluster by running 'kubectl config use-context cormac-workload-admin@cormac-workload'

$ kubectl config get-contexts
CURRENT   NAME                                                                     CLUSTER                           AUTHINFO                                NAMESPACE
          cormac-workload-admin@cormac-workload                                    cormac-workload                   cormac-workload-admin
*         tkg-mgmt-vsphere-20200702124420-admin@tkg-mgmt-vsphere-20200702124420   tkg-mgmt-vsphere-20200702124420   tkg-mgmt-vsphere-20200702124420-admin

$ kubectl config use-context cormac-workload-admin@cormac-workload
Switched to context "cormac-workload-admin@cormac-workload".

$ kubectl get nodes
NAME                                    STATUS   ROLES    AGE    VERSION
cormac-workload-control-plane-9g2n9     Ready    master   3m4s   v1.18.3+vmware.1
cormac-workload-md-0-5475745498-ncstb   Ready    <none>   89s    v1.18.3+vmware.1
Now you can begin to use this workload cluster for applications. You can begin to build Storage Classes, Persistent Volumes, Persistent Volume Claims, Pods, and so on. And of course, you can run many more tkg commands to build additional workload clusters. In the example above, I created only the simplest of clusters. If you select the “prod” or production plan, you can also specify requirements such as the number of control plane nodes, the number of worker nodes, and the image that should be used for the nodes. Here is an example of a workload cluster with 3 control plane nodes and 5 worker nodes.
$ tkg create cluster cormac-workload --plan=prod -c 3 -w 5
Logs of the command execution can also be found at: /tmp/tkg-20200703T095705874206587.log
Validating configuration...
Creating workload cluster 'cormac-workload'...
Waiting for cluster to be initialized...
Waiting for cluster nodes to be available...

Workload cluster 'cormac-workload' created
$
And this is what that cluster looks like in the vSphere inventory:
$ kubectl config get-contexts
CURRENT   NAME                                                                     CLUSTER                           AUTHINFO                                NAMESPACE
*         tkg-mgmt-vsphere-20200703085854-admin@tkg-mgmt-vsphere-20200703085854   tkg-mgmt-vsphere-20200703085854   tkg-mgmt-vsphere-20200703085854-admin

$ tkg get credentials cormac-workload
Credentials of workload cluster 'cormac-workload' have been saved
You can now access the cluster by running 'kubectl config use-context cormac-workload-admin@cormac-workload'

$ kubectl config get-contexts
CURRENT   NAME                                                                     CLUSTER                           AUTHINFO                                NAMESPACE
          cormac-workload-admin@cormac-workload                                    cormac-workload                   cormac-workload-admin
*         tkg-mgmt-vsphere-20200703085854-admin@tkg-mgmt-vsphere-20200703085854   tkg-mgmt-vsphere-20200703085854   tkg-mgmt-vsphere-20200703085854-admin

$ kubectl config use-context cormac-workload-admin@cormac-workload
Switched to context "cormac-workload-admin@cormac-workload".

$ kubectl get nodes
NAME                                    STATUS   ROLES    AGE     VERSION
cormac-workload-control-plane-8jgrn     Ready    master   3m1s    v1.18.3+vmware.1
cormac-workload-control-plane-96mnz     Ready    master   5m46s   v1.18.3+vmware.1
cormac-workload-control-plane-cvrj6     Ready    master   10m     v1.18.3+vmware.1
cormac-workload-md-0-5475745498-2pg9d   Ready    <none>   5m49s   v1.18.3+vmware.1
cormac-workload-md-0-5475745498-md8vn   Ready    <none>   5m46s   v1.18.3+vmware.1
cormac-workload-md-0-5475745498-q75st   Ready    <none>   5m43s   v1.18.3+vmware.1
cormac-workload-md-0-5475745498-vfq6t   Ready    <none>   5m43s   v1.18.3+vmware.1
cormac-workload-md-0-5475745498-xjm9d   Ready    <none>   5m51s   v1.18.3+vmware.1
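As mentioned earlier, the workload cluster can now be used for applications, starting with storage objects. Below is a minimal, hypothetical sketch of a StorageClass and Persistent Volume Claim. It assumes the vSphere CSI driver (provisioner csi.vsphere.vmware.com) is running in the workload cluster and that a storage policy named "vsan-default" exists in vCenter; substitute your own policy name:

$ cat example-storage.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsphere-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vsan-default"    # hypothetical storage policy name
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: vsphere-sc
  resources:
    requests:
      storage: 5Gi

$ kubectl apply -f example-storage.yaml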
5. TKG Cleanup
Cleaning up TKG is a two-step process. First remove the workload clusters, then delete the management cluster. After deleting the workload cluster, I changed contexts to the management cluster just to query the clusters from kubectl. This step is not necessary – it is just another way of querying the cluster.
One other interesting point is that the kind bootstrap cluster is stood up once again. I assume this is so we can switch to that cluster in order to be able to fully delete the management cluster, rather than try to delete the object that we are currently communicating with.
$ tkg get cluster
 NAME             NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES
 cormac-workload  default    running  1/1           1/1      v1.18.3+vmware.1

$ tkg delete cluster cormac-workload
Deleting workload cluster 'cormac-workload'. Are you sure?: y
workload cluster cormac-workload is being deleted

$ tkg get cluster
 NAME  NAMESPACE  STATUS  CONTROLPLANE  WORKERS  KUBERNETES

$ kubectl config get-contexts
CURRENT   NAME                    CLUSTER   AUTHINFO        NAMESPACE
          tkgmgmt-admin@tkgmgmt   tkgmgmt   tkgmgmt-admin

$ kubectl config use-context tkgmgmt-admin@tkgmgmt
Switched to context "tkgmgmt-admin@tkgmgmt".

$ kubectl get cluster -A
NAMESPACE    NAME      PHASE
tkg-system   tkgmgmt   Provisioned

$ tkg get management-cluster
 MANAGEMENT-CLUSTER-NAME  CONTEXT-NAME
 tkgmgmt *                tkgmgmt-admin@tkgmgmt

$ tkg delete management-cluster tkgmgmt
Logs of the command execution can also be found at: /tmp/tkg-20200702T122047044190616.log
Deleting management cluster 'tkgmgmt'. Are you sure?: y
Verifying management cluster...
Setting up cleanup cluster...
Installing providers to cleanup cluster...
Fetching providers
Installing cert-manager
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.6" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.6" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.6" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-vsphere" Version="v0.6.5" TargetNamespace="capv-system"
Moving Cluster API objects from management cluster to cleanup cluster...
Performing move...
Discovering Cluster API objects
Moving Cluster API objects Clusters=1
Creating objects in the target cluster
Deleting objects from the source cluster
Waiting for the Cluster API objects to get ready after move...
Deleting regional management cluster...
Management cluster 'tkgmgmt' deleted.
Deleting the management cluster context from the kubeconfig file '/home/cormac/.kube/config'
warning: this removed your active context, use "kubectl config use-context" to select a different one

Management cluster deleted!
$
And at this point, all traces of the TKG clusters are removed.
6. TKG Deployment Troubleshooting
There may be a need to see why the TKG management cluster did not deploy successfully. One way of doing this is to log in to the bootstrap cluster / kind control plane container, using a docker exec command as follows:
$ docker ps
CONTAINER ID   IMAGE                                                COMMAND                  \
    CREATED         STATUS         PORTS                       NAMES
e9b5fb8b986a   registry.tkg.vmware.run/kind/node:v1.18.3_vmware.1   "/usr/local/bin/entr…"   \
    2 minutes ago   Up 2 minutes   127.0.0.1:45919->6443/tcp   tkg-kind-brveafnavu7g23j0t5e0-control-plane

$ docker exec -it e9b5fb8b986a /bin/bash
root@tkg-kind-brveafnavu7g23j0t5e0-control-plane:/#
Now you can look at the various log files for the bootstrap cluster. These are all located in /var/log.
root@tkg-kind-brveafnavu7g23j0t5e0-control-plane:/# cd /var/log
root@tkg-kind-brveafnavu7g23j0t5e0-control-plane:/var/log# ls pods
capi-kubeadm-bootstrap-system_capi-kubeadm-bootstrap-controller-manager-6857dfc668-zk67x_4300c6dc-3f50-41cd-99d8-d4203419c420
capi-kubeadm-control-plane-system_capi-kubeadm-control-plane-controller-manager-85f4885cf5-trh22_2332d4da-dd31-444b-b96e-5de1ec1ddb15
capi-system_capi-controller-manager-5df8c8fb59-l47fm_4e8a98df-4af9-40eb-a418-503e7a668a14
capi-webhook-system_capi-controller-manager-7d8d9b87b8-6hq9x_4568906b-4ce9-4a89-a11f-eb2400525f0a
capi-webhook-system_capi-kubeadm-bootstrap-controller-manager-dff99d987-l4vr6_ef3f7ef8-0dcf-4aef-b609-c12103be59d5
capi-webhook-system_capi-kubeadm-control-plane-controller-manager-6cc995dd6c-wgcsd_d563e990-4c77-4769-9a96-548b9d4f1e24
capi-webhook-system_capv-controller-manager-5cdc58c9ff-c54h6_001d37d5-d68d-4afa-9589-a2bd3d4e622d
capv-system_capv-controller-manager-546d5b4b78-xznd2_3f607f72-fe7c-4577-ac58-3630c4e9d492
cert-manager_cert-manager-b56b4dc78-lp22b_875bee0e-8f0e-48d7-bd84-0ec0ecf24ff7
cert-manager_cert-manager-cainjector-6b54f84d85-qgjbp_8b8223d7-90d4-42b2-8866-3003004326aa
cert-manager_cert-manager-webhook-6fbc6d7449-qmqgm_fd9e4e02-12cd-4c1b-8a74-0f24b0e86043
kube-system_coredns-dbbffcb66-c8mz4_7540534d-5ebf-46e5-b3a7-808d253e4152
kube-system_coredns-dbbffcb66-mgh54_cd93a10e-c34a-4e7c-a4d0-8762b70fe68c
kube-system_etcd-tkg-kind-brveafnavu7g23j0t5e0-control-plane_90686900a9bab245caac254dd65ae1a6
kube-system_kindnet-9tk2p_1abf6881-ef5d-4646-a375-0f5d6b3e5916
kube-system_kube-apiserver-tkg-kind-brveafnavu7g23j0t5e0-control-plane_ee85bcff6177159f1125b2226dcad9b5
kube-system_kube-controller-manager-tkg-kind-brveafnavu7g23j0t5e0-control-plane_fb74d22435982c7fbf6da8e6f249e2c9
kube-system_kube-proxy-hv2r7_36381438-854b-49e3-8d66-4f7739e22729
kube-system_kube-scheduler-tkg-kind-brveafnavu7g23j0t5e0-control-plane_f3f4e6af5f879aff5937d22e94bac1f7
local-path-storage_local-path-provisioner-774f7f8fdb-7hsnc_ba3591cd-582e-4d6a-b10d-86ffe757c3f7

root@tkg-kind-brveafnavu7g23j0t5e0-control-plane:/var/log# ls containers
capi-controller-manager-5df8c8fb59-l47fm_capi-system_kube-rbac-proxy-9b15f6205a69b0ed9ea1bbd340638173b28cb38921aa19bfd6fb2a7ca3557840.log
capi-controller-manager-5df8c8fb59-l47fm_capi-system_manager-2d1c9b815014c12f2d2ae8af2bc23545027f0db2d22616e7aac5356effae4f87.log
capi-controller-manager-7d8d9b87b8-6hq9x_capi-webhook-system_kube-rbac-proxy-d485df7899eba33eede51e72c8dd7fcc5a7b5c4aca477c8250243054012f797b.log
capi-controller-manager-7d8d9b87b8-6hq9x_capi-webhook-system_manager-c48c3ea0ad7dd0f5dca73a33131bf5d80019eab149f6e861be19351f0c7ac985.log
capi-kubeadm-bootstrap-controller-manager-6857dfc668-zk67x_capi-kubeadm-bootstrap-system_kube-rbac-proxy-310665e0515da7f65a0ba03bbea227e6575f338fa5b634fb29ebb59289e35e46.log
capi-kubeadm-bootstrap-controller-manager-6857dfc668-zk67x_capi-kubeadm-bootstrap-system_manager-99e238b941406409f404b632dd1740152b5d98f406a808819f0c6fc556bda1ff.log
capi-kubeadm-bootstrap-controller-manager-dff99d987-l4vr6_capi-webhook-system_kube-rbac-proxy-dde7c58b183dcc2d516447a7932a727f909fb96fb1cad89a961468d28f74c142.log
capi-kubeadm-bootstrap-controller-manager-dff99d987-l4vr6_capi-webhook-system_manager-4cfcba0bf1c76530c8b6116c61fe6a15f09d7cd3c157c1b109ca75da5ded93f5.log
capi-kubeadm-control-plane-controller-manager-6cc995dd6c-wgcsd_capi-webhook-system_kube-rbac-proxy-b8afb5f7b555fdbd40e2407d8cb102e590ba6d7a6967b146f68789fe0bed6ffd.log
capi-kubeadm-control-plane-controller-manager-6cc995dd6c-wgcsd_capi-webhook-system_manager-096060da6ca21b1f3b0805b2a440c1fe59b4a12d18b5aeae2fdcf781ef6f3ad9.log
capi-kubeadm-control-plane-controller-manager-6cc995dd6c-wgcsd_capi-webhook-system_manager-9944a9dfeec54333536a08646868fe2e1e1b3ee6099f3e94a8bd8ff39bdaa230.log
capi-kubeadm-control-plane-controller-manager-85f4885cf5-trh22_capi-kubeadm-control-plane-system_kube-rbac-proxy-d56a61de271e47500144c2f99b6f1087a145c4a9a48024cfc4df4fd813f4e656.log
capi-kubeadm-control-plane-controller-manager-85f4885cf5-trh22_capi-kubeadm-control-plane-system_manager-c56da1c0852fe5a557d31bb7aff6b49166075ea7f6ec94ea11255d4d7a957e2b.log
capv-controller-manager-546d5b4b78-xznd2_capv-system_kube-rbac-proxy-a6ec83a76cc218412087e02aebe18df758d261032d8980777347d507425b6163.log
capv-controller-manager-546d5b4b78-xznd2_capv-system_manager-7f755d80002ba9f7b115662c68d3c5603a51e495741a47b8668818da8f47fa1d.log
capv-controller-manager-5cdc58c9ff-c54h6_capi-webhook-system_kube-rbac-proxy-a32432af21eb459ab1cdae2fcb827c2631cfdad4f89480faf46eb9c18063b955.log
capv-controller-manager-5cdc58c9ff-c54h6_capi-webhook-system_manager-d21d2386d64424b891f45aa869f042a2df84401864f0e152f4365f61c79bd631.log
cert-manager-b56b4dc78-lp22b_cert-manager_cert-manager-ddd1b4fc3a639d0baf6e08ef3d03e94f6d201f06870dde7fe0fd7d8969d2bb97.log
cert-manager-cainjector-6b54f84d85-qgjbp_cert-manager_cainjector-7190813bdac45dc26838c47cb3b31d087db8c6d03024631cfdac400f4112c3b9.log
cert-manager-webhook-6fbc6d7449-qmqgm_cert-manager_cert-manager-ce208041b82f8e8d3f1a557ce448b387c3d5800e25a12b9f1f477f9ca49e91c5.log
cert-manager-webhook-6fbc6d7449-qmqgm_cert-manager_cert-manager-f0dee859352a94e3a11f516eb616c37af9cd4abf14e44eef2ab64a1fc7abea15.log
coredns-dbbffcb66-c8mz4_kube-system_coredns-7fff45ced9afca22169f9d81b41b5cd87a62352b55aa90d406b3d7e2f1d1878c.log
coredns-dbbffcb66-mgh54_kube-system_coredns-d1fbb9b15590eeb5e4ce776b7f570820413347db90956af4a8a0a461b7d392a0.log
etcd-tkg-kind-brveafnavu7g23j0t5e0-control-plane_kube-system_etcd-9b7343b8a053e720c8acaf9179ff455869415057e9f0243be004f9cae433ace5.log
kindnet-9tk2p_kube-system_kindnet-cni-78de1eeacb92283157c7498fb0a1bbabf1e46691d6f5474842204154dea7f9fc.log
kube-apiserver-tkg-kind-brveafnavu7g23j0t5e0-control-plane_kube-system_kube-apiserver-6672c0be051251a783a63c86ed7c51cef15e5be2d02b62f50b101d454e30e55c.log
kube-controller-manager-tkg-kind-brveafnavu7g23j0t5e0-control-plane_kube-system_kube-controller-manager-99d8f7c5b66b32826c93871e4967ad015adec1c8516818c8ecd7fd08f45f4585.log
kube-proxy-hv2r7_kube-system_kube-proxy-84733041e786bbc53d07fcce291f7db54da22a80b6cb915afdb5658cab013928.log
kube-scheduler-tkg-kind-brveafnavu7g23j0t5e0-control-plane_kube-system_kube-scheduler-a6deb97edc6aa4d5cbe68e2f29b9639e7cf8cc060938721d6001efe4299461d0.log
local-path-provisioner-774f7f8fdb-7hsnc_local-path-storage_local-path-provisioner-b01f701fefab582d1b22d30a8b8eaee9bb9780f53ef68a0bf5b9d314210f32b8.log
root@tkg-kind-brveafnavu7g23j0t5e0-control-plane:/var/log#
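For vSphere provisioning problems, the log that is usually most interesting is the CAPV controller manager log listed above. A hedged example of following it from inside the kind container (the hashed part of the file name will differ on your system):

root@tkg-kind-brveafnavu7g23j0t5e0-control-plane:/var/log# tail -f containers/capv-controller-manager-*_capv-system_manager-*.log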
For deeper troubleshooting, consider Crash Recovery and Diagnostics for Kubernetes (crashd for short). This tool collects all the necessary logs from the cluster for you, rather than you having to exec into the cluster as shown previously. More detail on crashd can be found here.
I hope this has given you a good grasp of the TKG (standalone) product, and how it can be used to very simply and very quickly deploy Kubernetes clusters.