Deploying Carvel packages on TKGS workload clusters in vSphere with Tanzu

I’ve posted a number of articles on this site describing how to deploy TKG v1.4 Carvel packages on the multi-cloud version of TKG, often referred to as TKGm. But did you know that these packages can also be deployed to clusters provisioned by the TKG Service (TKGS) on vSphere with Tanzu? In this post, I will run through the steps to achieve this. You can find the official documentation here; it will be referred to from time to time during this post, especially for some of the manifests. It should also be noted that some of the packages, such as Grafana, Prometheus and Fluent Bit, have not yet been validated on any release later than vSphere 7.0U2, so we will focus initially on what is needed to deploy the Cert Manager package on my vSphere 7.0U3 environment. Note also that there is a requirement to use the tanzu CLI with vSphere with Tanzu, which we saw in an earlier post when it was used to provision Tanzu Kubernetes workload clusters on vSphere with Tanzu. Let’s begin by taking a look at the full list of settings from my environment. In some cases, these are higher than the requirements, but I am posting them here for reference.

Environment

  • vSphere 7.0U3c build 19234570
  • Existing vSphere with Tanzu Supervisor Cluster – v1.21.0+vmware.wcp.2-vsc0.0.12-18735554
  • Existing Tanzu Kubernetes workload cluster – v1.20.7+vmware.1-tkg.1.7fb9067
  • Existing NSX ALB / Avi Vantage Load Balancer – v20.1.5
  • Tanzu CLI – v1.4.1 buildDate: 2022-01-04
  • Additional Tanzu CLI tools – ytt / kapp / kbld / imgpkg

Set tanzu CLI context to TKC

Let’s now proceed with the deployment / setup steps. First, log in to vSphere with Tanzu using the Supervisor cluster context. Then set the tanzu CLI context to the Supervisor cluster. Finally, log in to vSphere with Tanzu once more, this time setting the context to the Tanzu Kubernetes workload cluster.

% kubectl-vsphere login --server xx.xx.62.16 --vsphere-username administrator@vsphere.local \
--insecure-skip-tls-verify

Logged in successfully.

You have access to the following contexts:
  xx.xx.62.16
  workload

If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`


% kubectl config use-context xx.xx.62.16
Switched to context "xx.xx.62.16".


% tanzu login --context xx.xx.62.16
? Select a server xx.xx.62.16        ()
✔  successfully logged in to management cluster using the kubeconfig xx.xx.62.16


% kubectl-vsphere login --server xx.xx.62.16 --vsphere-username administrator@vsphere.local \
--insecure-skip-tls-verify \
--tanzu-kubernetes-cluster-namespace workload  \
--tanzu-kubernetes-cluster-name workload1

Logged in successfully.

You have access to the following contexts:
  xx.xx.62.16
  workload
  workload1

If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`


% kubectl config use-context workload1
Switched to context "workload1".


% kubectl get nodes
NAME                                      STATUS  ROLES                   AGE   VERSION
workload1-control-plane-pdvqt             Ready    control-plane,master   5d    v1.20.7+vmware.1
workload1-workers-gwdmd-6794cd57d8-pj8lk  Ready    <none>                 5d    v1.20.7+vmware.1
workload1-workers-gwdmd-6794cd57d8-ts859  Ready    <none>                 96m   v1.20.7+vmware.1
workload1-workers-gwdmd-6794cd57d8-wjfxq  Ready    <none>                 96m   v1.20.7+vmware.1


Define a default StorageClass

One of the steps in the official docs is to make sure that you have a default storage class. To do this, you need to patch one of the storage classes that you assigned to the namespace where the TKC has been deployed. I only added one storage class, the vSAN default storage policy, so these are the steps to make this the default.

% kubectl get sc
NAME                          PROVISIONER              RECLAIMPOLICY  VOLUMEBINDINGMODE  ALLOWVOLUMEEXPANSION  AGE
vsan-default-storage-policy  csi.vsphere.vmware.com  Delete          Immediate          true                  5d


% kubectl patch storageclass vsan-default-storage-policy -p \
'{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/vsan-default-storage-policy patched


% kubectl get sc
NAME                                    PROVISIONER              RECLAIMPOLICY  VOLUMEBINDINGMODE  ALLOWVOLUMEEXPANSION  AGE
vsan-default-storage-policy (default)  csi.vsphere.vmware.com  Delete          Immediate          true                  5d

Create ClusterRole/ClusterRoleBinding for Service Accounts

Since the majority of the packages we are installing will have their own service accounts, we need to ensure that these service accounts have the appropriate privileges to carry out tasks, such as the ability to create Pods in the workload cluster. By default, there are two predefined pod security policies associated with the workload cluster.

% kubectl get psp
NAME                      PRIV    CAPS  SELINUX    RUNASUSER          FSGROUP    SUPGROUP    READONLYROOTFS  VOLUMES
vmware-system-privileged  true    *      RunAsAny  RunAsAny          RunAsAny    RunAsAny    false            *
vmware-system-restricted  false          RunAsAny  MustRunAsNonRoot  MustRunAs   MustRunAs   false            configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim

While you could create your own PodSecurityPolicy, I am going to use one of the predefined ones, vmware-system-privileged. I will reference it as a resource in a new ClusterRole, and then bind this new ClusterRole to the service accounts through a new ClusterRoleBinding, as follows:

% cat securityPolicy.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp:privileged
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames:
  - vmware-system-privileged

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: all:psp:privileged
roleRef:
  kind: ClusterRole
  name: psp:privileged
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  name: system:serviceaccounts
  apiGroup: rbac.authorization.k8s.io

Create the new ClusterRole and ClusterRoleBinding.

% kubectl apply -f securityPolicy.yaml
clusterrole.rbac.authorization.k8s.io/psp:privileged created
clusterrolebinding.rbac.authorization.k8s.io/all:psp:privileged created

If you want your service accounts to use a Pod Security Policy that is not quite so privileged, here is one that I managed to create which allowed the cert-manager package to deploy successfully. You would obviously have to change the resourceNames entry in the ClusterRole above to match this PSP name. Note that this policy may require additional entries for other packages, notably around networking and ports.

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tester-psp
spec:
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - configMap
    - emptyDir
    - projected
    - secret
    - downwardAPI
    - persistentVolumeClaim
  fsGroup:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
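
If you go this route, a minimal sketch of applying it and re-pointing the earlier ClusterRole at it might look like this (assuming the manifest above is saved as tester-psp.yaml, and that the ClusterRole contains only the single rule shown earlier):

% kubectl apply -f tester-psp.yaml
% kubectl patch clusterrole psp:privileged --type=json \
-p='[{"op": "replace", "path": "/rules/0/resourceNames/0", "value": "tester-psp"}]'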

Deploy the kapp-controller manifest

The kapp-controller is an integral part of Carvel package management. You can learn more about it here. Suffice it to say that this is how we will be able to install and manage the TKG v1.4 packages on vSphere with Tanzu workload clusters.

There are two parts to this step. The first is to create a Pod Security Policy for the kapp-controller service account, not to be confused with the PSP created earlier for the various package service accounts. The second is to deploy the kapp-controller manifest. As we have seen, there are two PSPs on the cluster by default; we will now create a third. This is the PSP for the kapp-controller service account, which is available in the official documentation.

% cat tanzu-system-kapp-ctrl-restricted.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tanzu-system-kapp-ctrl-restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - configMap
    - emptyDir
    - projected
    - secret
    - downwardAPI
    - persistentVolumeClaim
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: MustRunAs
    ranges:
      - min: 1
        max: 65535
  fsGroup:
    rule: MustRunAs
    ranges:
      - min: 1
        max: 65535
  readOnlyRootFilesystem: false

If we create this new Pod Security Policy, we should now observe a third PSP on this cluster.

% kubectl apply -f tanzu-system-kapp-ctrl-restricted.yaml
podsecuritypolicy.policy/tanzu-system-kapp-ctrl-restricted created


% kubectl get psp
NAME                                PRIV    CAPS  SELINUX    RUNASUSER          FSGROUP    SUPGROUP    READONLYROOTFS  VOLUMES
tanzu-system-kapp-ctrl-restricted   false          RunAsAny  MustRunAsNonRoot  MustRunAs   MustRunAs   false            configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim
vmware-system-privileged            true    *      RunAsAny  RunAsAny          RunAsAny    RunAsAny    false            *
vmware-system-restricted            false          RunAsAny  MustRunAsNonRoot  MustRunAs   MustRunAs   false            configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim

There is no need to manually create the ClusterRole or ClusterRoleBinding for this PSP as they are already included in the kapp-controller YAML manifest. The official documentation has a link to the kapp-controller YAML manifest required to deploy the kapp-controller, so I am not going to reproduce it here since it is quite large. Once you have copied the manifest into a file locally, we can proceed with the deployment and apply it to the workload cluster. As you can see below, a significant number of package-related Custom Resource Definitions (CRDs) are added to the cluster, many of which we can query using kubectl, as we will see shortly.

% vi kapp-controller.yaml    ### paste the manifest contents here and save it


% kubectl apply -f kapp-controller.yaml
namespace/tkg-system created
namespace/tanzu-package-repo-global created
apiservice.apiregistration.k8s.io/v1alpha1.data.packaging.carvel.dev created
service/packaging-api created
customresourcedefinition.apiextensions.k8s.io/internalpackagemetadatas.internal.packaging.carvel.dev created
customresourcedefinition.apiextensions.k8s.io/internalpackages.internal.packaging.carvel.dev created
customresourcedefinition.apiextensions.k8s.io/apps.kappctrl.k14s.io created
customresourcedefinition.apiextensions.k8s.io/packageinstalls.packaging.carvel.dev created
customresourcedefinition.apiextensions.k8s.io/packagerepositories.packaging.carvel.dev created
configmap/kapp-controller-config created
deployment.apps/kapp-controller created
serviceaccount/kapp-controller-sa created
clusterrole.rbac.authorization.k8s.io/kapp-controller-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/kapp-controller-cluster-role-binding created
clusterrolebinding.rbac.authorization.k8s.io/pkg-apiserver:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/pkgserver-auth-reader created


% kubectl get pods -n tkg-system -w | grep kapp-controller
kapp-controller-5d8f7d9477-9z7n2   0/1     ContainerCreating   0          19s
kapp-controller-5d8f7d9477-9z7n2   1/1     Running             0          93s


% kubectl get all -n tkg-system
NAME                                   READY   STATUS    RESTARTS   AGE
pod/kapp-controller-5d8f7d9477-9z7n2   1/1     Running   0          7m48s

NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/packaging-api   ClusterIP   100.70.177.69   <none>        443/TCP   7m50s

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kapp-controller   1/1     1            1           7m49s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/kapp-controller-5d8f7d9477   1         1         1       7m49s


% kubectl get crd
NAME                                                               CREATED AT
antreaagentinfos.clusterinformation.antrea.tanzu.vmware.com        2022-02-09T10:39:40Z
antreacontrollerinfos.clusterinformation.antrea.tanzu.vmware.com   2022-02-09T10:39:40Z
apps.kappctrl.k14s.io                                              2022-02-14T11:16:36Z
clusternetworkpolicies.security.antrea.tanzu.vmware.com            2022-02-09T10:39:40Z
externalentities.core.antrea.tanzu.vmware.com                      2022-02-09T10:39:40Z
internalpackagemetadatas.internal.packaging.carvel.dev             2022-02-14T11:16:36Z
internalpackages.internal.packaging.carvel.dev                     2022-02-14T11:16:36Z
networkpolicies.security.antrea.tanzu.vmware.com                   2022-02-09T10:39:40Z
packageinstalls.packaging.carvel.dev                               2022-02-14T11:16:37Z
packagerepositories.packaging.carvel.dev                           2022-02-14T11:16:37Z
tiers.security.antrea.tanzu.vmware.com                             2022-02-09T10:39:40Z
traceflows.ops.antrea.tanzu.vmware.com                             2022-02-09T10:39:41Z

Add the TKG v1.4 Package Repository

The next step is to use the tanzu CLI to add the TKG v1.4 package repository. Once the repository is added, we can query it and install packages directly from it. I have named the repo tkg14repo. Once the repository has reconciled, you can query it using the tanzu CLI, or using kubectl as shown below. Once the reconcile has succeeded, we can begin to deploy packages from the repo.

% tanzu package repository add tkg14repo --url projects.registry.vmware.com/tkg/packages/standard/repo:v1.4.0
/ Adding package repository 'tkg14repo'...
 Added package repository 'tkg14repo'


% tanzu package repository list
/ Retrieving repositories...
  NAME       REPOSITORY                                                      STATUS       DETAILS
  tkg14repo  projects.registry.vmware.com/tkg/packages/standard/repo:v1.4.0  Reconciling


% tanzu package repository list
/ Retrieving repositories...
  NAME       REPOSITORY                                                      STATUS               DETAILS
  tkg14repo  projects.registry.vmware.com/tkg/packages/standard/repo:v1.4.0  Reconcile succeeded
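
The repository is also represented on the cluster as a PackageRepository custom resource (one of the CRDs added by kapp-controller earlier), so if you prefer kubectl, something along these lines should work as well (a sketch; output omitted):

% kubectl get packagerepositories
% kubectl describe packagerepository tkg14repo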


% tanzu package available list
- Retrieving available packages...
  NAME                           DISPLAY-NAME  SHORT-DESCRIPTION
  cert-manager.tanzu.vmware.com  cert-manager  Certificate management
  contour.tanzu.vmware.com       Contour       An ingress controller
  external-dns.tanzu.vmware.com  external-dns  This package provides DNS synchronization functionality.
  fluent-bit.tanzu.vmware.com    fluent-bit    Fluent Bit is a fast Log Processor and Forwarder
  grafana.tanzu.vmware.com       grafana       Visualization and analytics software
  harbor.tanzu.vmware.com        Harbor        OCI Registry
  multus-cni.tanzu.vmware.com    multus-cni    This package provides the ability for enabling attaching multiple network interfaces to pods in Kubernetes
  prometheus.tanzu.vmware.com    prometheus    A time series database for your metrics


% kubectl get packages
NAME                                                 PACKAGEMETADATA NAME            VERSION                 AGE
cert-manager.tanzu.vmware.com.1.1.0+vmware.1-tkg.2   cert-manager.tanzu.vmware.com   1.1.0+vmware.1-tkg.2    8m30s
contour.tanzu.vmware.com.1.17.1+vmware.1-tkg.1       contour.tanzu.vmware.com        1.17.1+vmware.1-tkg.1   8m30s
external-dns.tanzu.vmware.com.0.8.0+vmware.1-tkg.1   external-dns.tanzu.vmware.com   0.8.0+vmware.1-tkg.1    8m30s
fluent-bit.tanzu.vmware.com.1.7.5+vmware.1-tkg.1     fluent-bit.tanzu.vmware.com     1.7.5+vmware.1-tkg.1    8m30s
grafana.tanzu.vmware.com.7.5.7+vmware.1-tkg.1        grafana.tanzu.vmware.com        7.5.7+vmware.1-tkg.1    8m30s
harbor.tanzu.vmware.com.2.2.3+vmware.1-tkg.1         harbor.tanzu.vmware.com         2.2.3+vmware.1-tkg.1    8m30s
multus-cni.tanzu.vmware.com.3.7.1+vmware.1-tkg.1     multus-cni.tanzu.vmware.com     3.7.1+vmware.1-tkg.1    8m30s
prometheus.tanzu.vmware.com.2.27.0+vmware.1-tkg.1    prometheus.tanzu.vmware.com     2.27.0+vmware.1-tkg.1   8m29s

As you can see from the list above, there are currently 8 packages that are available for installation. Note once again that some of these packages have not yet been validated on vSphere 7.0U3, as highlighted earlier in the official docs.

The packages come with a set of configurable variables, which can be retrieved using the tanzu CLI tools that were outlined in the prerequisites. Let’s look at how to retrieve those, and how to include them with a package deployment next.

Getting configurable parameters from a package

Since we are going to install the Certificate Manager package, we may as well look at the configuration of that particular package. To do that, we need to retrieve the package image, which we can do using the following command, although it is easier to store the result in a variable for future use. I also deliberately chose the cert-manager package for this exercise as it is one of the simplest from a configuration perspective: the only configurable parameter is the namespace into which the package installs its objects, e.g. deployments, replicaSets and pods. The file in question is called values.yaml, and it can be found under the config folder.

% kubectl get packages cert-manager.tanzu.vmware.com.1.1.0+vmware.1-tkg.2 \
-o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}'
projects.registry.vmware.com/tkg/packages/standard/cert-manager@sha256:117a35c2b496cbac0e6562d9b48cb821da829454e12c1ac7a6dedb6968c76de8%


% image_url=$(kubectl get packages cert-manager.tanzu.vmware.com.1.1.0+vmware.1-tkg.2 \
-o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}')


% echo $image_url
projects.registry.vmware.com/tkg/packages/standard/cert-manager@sha256:117a35c2b496cbac0e6562d9b48cb821da829454e12c1ac7a6dedb6968c76de8

% mkdir cert-mgr


% imgpkg pull -b $image_url -o ./cert-mgr
Pulling bundle 'projects.registry.vmware.com/tkg/packages/standard/cert-manager@sha256:117a35c2b496cbac0e6562d9b48cb821da829454e12c1ac7a6dedb6968c76de8'
  Extracting layer 'sha256:036fc2c9f0894cdd14bee9ee9099c22613e243b1d06b165734ea3b0014bfb0fe' (1/1)

Locating image lock file images...
The bundle repo (projects.registry.vmware.com/tkg/packages/standard/cert-manager) is hosting every image specified in the bundle's Images Lock file (.imgpkg/images.yml)

Succeeded

% ls -R ./cert-mgr
config

./cert-mgr/config:
_ytt_lib cert-manager.yaml overlays values.yaml

./cert-mgr/config/_ytt_lib:
bundle

./cert-mgr/config/_ytt_lib/bundle:
config

./cert-mgr/config/_ytt_lib/bundle/config:
overlays upstream values.yaml

./cert-mgr/config/_ytt_lib/bundle/config/overlays:
overlay-namespace.yaml

./cert-mgr/config/_ytt_lib/bundle/config/upstream:
cert-manager.yaml


./cert-mgr/config/overlays:
update-cert-deployment.yaml


% cat cert-mgr/config/values.yaml
#@data/values
---

#! The namespace in which to deploy cert-manager.
namespace: cert-manager
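
Since ytt is among the CLI tools listed in the prerequisites, you can also preview what a values override would do to the rendered manifests before installing anything. A hedged sketch, where cert-manager-ns is a hypothetical namespace value:

% ytt -f ./cert-mgr/config -v namespace=cert-manager-ns > cert-manager-rendered.yaml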

Deploy a package

We can now finally deploy a package from the TKG v1.4 repository to a Tanzu Kubernetes workload cluster on vSphere with Tanzu. Since we are going to stick with the default configuration values for this package (namespace set to cert-manager), there is no need to add a --values-file option to the command to specify our own bespoke configuration parameters file. Once the package is installed, its status can be queried via both the tanzu CLI and kubectl.
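
For reference, had we wanted to override the namespace, the install would look something along these lines (a sketch; my-values.yaml is a hypothetical file with a namespace: entry like the values.yaml shown above):

% tanzu package install cert-manager --package-name cert-manager.tanzu.vmware.com \
--version 1.1.0+vmware.1-tkg.2 --values-file my-values.yaml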

% tanzu package install cert-manager --package-name cert-manager.tanzu.vmware.com --version 1.1.0+vmware.1-tkg.2
/ Installing package 'cert-manager.tanzu.vmware.com'
| Getting namespace 'default'
/ Getting package metadata for 'cert-manager.tanzu.vmware.com'
| Creating service account 'cert-manager-default-sa'
| Creating cluster admin role 'cert-manager-default-cluster-role'
| Creating cluster role binding 'cert-manager-default-cluster-rolebinding'
\ Creating package resource
\ Package install status: Reconciling

 Added installed package 'cert-manager' in namespace 'default'


% kubectl get all -n cert-manager
NAME                                          READY  STATUS    RESTARTS   AGE
pod/cert-manager-5849447d4c-8hhfv              1/1    Running  0          91s
pod/cert-manager-cainjector-5557f7bb89-zw2rg   1/1    Running  0          91s
pod/cert-manager-webhook-77947cd8fb-v8vc7      1/1    Running  0          91s

NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP  PORT(S)    AGE
service/cert-manager          ClusterIP  100.70.175.194  <none>        9402/TCP   91s
service/cert-manager-webhook  ClusterIP  100.69.24.70    <none>        443/TCP    91s

NAME                                      READY  UP-TO-DATE  AVAILABLE   AGE
deployment.apps/cert-manager              1/1    1            1          91s
deployment.apps/cert-manager-cainjector   1/1    1            1          91s
deployment.apps/cert-manager-webhook      1/1    1            1          91s

NAME                                                 DESIRED  CURRENT  READY  AGE
replicaset.apps/cert-manager-5849447d4c              1        1        1      91s
replicaset.apps/cert-manager-cainjector-5557f7bb89   1        1        1      91s
replicaset.apps/cert-manager-webhook-77947cd8fb      1        1        1      91s


% tanzu package installed list
/ Retrieving installed packages...
  NAME          PACKAGE-NAME                  PACKAGE-VERSION      STATUS
  cert-manager  cert-manager.tanzu.vmware.com  1.1.0+vmware.1-tkg.2  Reconciling


% tanzu package installed list
/ Retrieving installed packages...
  NAME          PACKAGE-NAME                  PACKAGE-VERSION      STATUS
  cert-manager  cert-manager.tanzu.vmware.com  1.1.0+vmware.1-tkg.2  Reconcile succeeded


% kubectl get apps
NAME          DESCRIPTION          SINCE-DEPLOY  AGE
cert-manager  Reconcile succeeded  30s            10m

The cert-manager package has now been successfully installed on the workload cluster using the tanzu command line, and is available for use by any other applications that you may wish to deploy on this cluster. Stay tuned, and I will go through the deployment of a Prometheus + Grafana monitoring stack on a vSphere with Tanzu workload cluster using the same method in a future post. However, using the example above, you should be able to achieve this now without too much difficulty. One thing to note is that the namespaces used by the Carvel packages themselves, and by the objects that a package installs, can be a bit confusing. Here is a piece I wrote about it for Tanzu Community Edition, which is also applicable here.
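
As a quick illustration of that distinction in this post: the PackageInstall object lives in the default namespace, while the objects the package created live in cert-manager. A hedged pair of commands for inspecting both (output omitted):

% kubectl get packageinstalls -n default
% kubectl get all -n cert-manager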

Finally, please note that Tanzu Services is now available in VMware Cloud. VMware Cloud allows vSphere administrators to deploy Tanzu Kubernetes workload clusters for their DevOps teams without having to manage the underlying SDDC infrastructure. Read more about Tanzu Services here.