Using Tanzu Mission Control Data Protection with on-premises S3 (MinIO)
Today, we will look at another feature of Tanzu Mission Control: Data Protection. In an earlier post, we saw how Tanzu Mission Control, or TMC for short, can be used to manage and create clusters on vSphere that have Identity Management integrated with LDAP/Active Directory. In that same post, we also saw how TMC-managed Tanzu Kubernetes clusters on vSphere utilized the NSX ALB for Load Balancing services. Now we will deploy an S3 Object Store from MinIO to an on-premises Tanzu Kubernetes cluster. This will then become the “backup target” for TMC Data Protection. TMC Data Protection uses the Velero product from VMware. Once configured, we will initiate a simple backup and restore operation from Tanzu Mission Control to ensure that everything is working as expected.
This is an opportune time for this post, as I just noticed some new TMC enhancements around Data Protection in the latest release notes. It seems that the data protection features of Tanzu Mission Control now allow you to restore a selected namespace from a cluster backup, as well as to restore without overwriting an existing namespace. These are some nice new features for sure.
While there is not too much text associated with this post, there are quite a number of screenshots, so it is a bit of a lengthy post. Hope you find it useful all the same.
Step 1: Deploy MinIO Operator
I have used an on-premises Tanzu Kubernetes cluster to host a MinIO S3 Object Store running on its own dedicated Kubernetes cluster. It is deployed on vSphere 7.0U2 and uses the NSX ALB for load balancer services. Note that there are 4 worker nodes. This appears to be a MinIO requirement for minimal erasure coding when creating a tenant, as we will see later.
I used the new MinIO Operator for this deployment. There are two methods to get started, and I have found the krew method the easier of the two, perhaps because I am already familiar with it. Once the MinIO Operator plugin is installed, the kubectl minio command is available to interact with the operator. We can use the operator to initialize MinIO on the cluster, and then create our first MinIO tenant with its own S3 Object Store.
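For reference, this is roughly how the plugin is installed via krew. A minimal sketch, assuming krew itself is already installed on the desktop and that the plugin is published in the krew index under the name minio:

% kubectl krew update
% kubectl krew install minio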
% kubectl minio
Deploy and manage the multi tenant, S3 API compatible object storage on Kubernetes

Usage:
  minio [command]

Available Commands:
  init        Initialize MinIO Operator
  tenant      Manage MinIO tenant(s)
  delete      Delete MinIO Operator
  proxy       Open a port-forward to Console UI
  version     Display plugin version

Flags:
  -h, --help                help for minio
      --kubeconfig string   Custom kubeconfig path

Use "minio [command] --help" for more information about a command.
%
To install MinIO, ensure that the Kubernetes configuration context is pointing to your MinIO Kubernetes cluster, and then run the following command.
% kubectl minio init
namespace/minio-operator created
serviceaccount/minio-operator created
clusterrole.rbac.authorization.k8s.io/minio-operator-role created
clusterrolebinding.rbac.authorization.k8s.io/minio-operator-binding created
customresourcedefinition.apiextensions.k8s.io/tenants.minio.min.io created
service/operator created
deployment.apps/minio-operator created
serviceaccount/console-sa created
clusterrole.rbac.authorization.k8s.io/console-sa-role created
clusterrolebinding.rbac.authorization.k8s.io/console-sa-binding created
configmap/console-env created
service/console created
deployment.apps/console created
-----------------

To open Operator UI, start a port forward using this command:

kubectl minio proxy -n minio-operator

-----------------
%
Notice the directive at the end of the previous command output to open the Operator UI. We will do that next. First though, verify that the operator is successfully deployed by checking the status of the following pods.
% kubectl get pods -n minio-operator
NAME                              READY   STATUS    RESTARTS   AGE
console-59d589b9b7-mfqp8          1/1     Running   0          2m57s
minio-operator-6447586784-rk4mv   1/1     Running   0          3m1s
%
Step 2: Create the first MinIO tenant
The init output above states that, to open the Operator UI, we should start a port forward using the following command. Although it is possible to create tenants with the kubectl minio tenant command, we will use the UI as it seems to be a bit more intuitive, and there are some advanced options we need to configure. Let’s run the port forward.
% kubectl minio proxy -n minio-operator
Starting port forward of the Console UI.

To connect open a browser and go to http://localhost:9090

Current JWT to login: eyJhbGciOiJ......weZPjNNbQ

Forwarding from 0.0.0.0:9090 -> 9090
From a browser on our desktop, connect to http://localhost:9090 to launch the Operator UI. The JWT is the JSON Web Token that we will use to log in to the UI. It has been truncated/obfuscated in the above output.
Copy and paste the JWT from the console output to the Operator Login. On initial login, there are no existing tenants. Click on the + to add a new one which we will use for Velero backups.
When creating a new tenant, you need to provide a name and a namespace (click + to add a new one if it does not already exist). You must also provide a storage class (I just chose the default that was already configured on the TKG cluster where MinIO is deployed). You might also wish to reduce the total size of the disk drives (the default is 100Gi). I have reduced this to 10Gi for my testing. There is a requirement to have a minimum of 4 servers, which is met by having 4 worker nodes in the TKG cluster.
The remaining settings can be left at their defaults apart from Security. I have disabled TLS (Transport Layer Security) on this tenant. This is fine for a proof-of-concept. For production, TLS should be enabled to establish trust between Velero and the MinIO S3 Object Store. The official Velero documentation describes the steps needed to enable Velero and the S3 Object Store to establish trust through self-signed certs. This is something that I believe is being scoped for inclusion in the Tanzu Mission Control interface at some point in the future, but at present it is a manual step. Thus, we will continue with TLS disabled.
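As an aside, a tenant along these lines could most likely be created from the CLI as well. This is only a sketch rather than the exact commands used here, and flag names may vary between versions of the MinIO plugin, so treat the storage class placeholder and the flags as assumptions:

% kubectl create ns velero
% kubectl minio tenant create velero --servers 4 --volumes 4 --capacity 10Gi --namespace velero --storage-class <your-storage-class>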
Here is a review of the configuration I created for the new tenant. We can now go ahead and create it.
When the tenant is successfully created, you are provided with an Access Key and a Secret Key via a pop-up window. Store these safely as we will need them again later in the exercise. You can monitor the creation of the tenant. Click on the tenant name (velero) to see more information.
Eventually the Summary view should show the tenant as healthy. It also provides a Console link and the MinIO Endpoint. These Load Balancer IP addresses come from the NSX ALB in this environment. We can use the Console link to manage the tenant, whilst the endpoint is what Velero uses as a target to ship its backup data to.
Let’s take a quick look at the Kubernetes cluster where MinIO is running. Some new objects should now be visible in the chosen namespace, in our case velero. Note that in the first output below, listing the services, there are Load Balancer / External IP addresses for both the minio endpoint and the console. These match what we saw in the summary details above.
% kubectl get svc -n velero
NAME                    TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)          AGE
minio                   LoadBalancer   10.96.91.2     xx.yy.zz.154   80:31365/TCP     2m11s
velero-console          LoadBalancer   10.96.81.5     xx.yy.zz.155   9090:30882/TCP   2m10s
velero-hl               ClusterIP      None           <none>         9000/TCP         2m10s
velero-log-hl-svc       ClusterIP      None           <none>         5432/TCP         45s
velero-log-search-api   ClusterIP      10.96.15.197   <none>         8080/TCP         45s

% kubectl get pods -n velero
NAME                                    READY   STATUS     RESTARTS   AGE
velero-log-0                            1/1     Running    0          79s
velero-log-search-api-dc985c94b-l4szs   1/1     Running    3          79s
velero-pool-0-0                         1/1     Running    0          2m43s
velero-pool-0-1                         1/1     Running    0          2m43s
velero-pool-0-2                         1/1     Running    0          2m43s
velero-pool-0-3                         1/1     Running    0          2m43s
velero-prometheus-0                     0/2     Init:0/1   0          19s

% kubectl get deploy -n velero
NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
velero-log-search-api   1/1     1            1           87s

% kubectl get rs -n velero
NAME                              DESIRED   CURRENT   READY   AGE
velero-log-search-api-dc985c94b   1         1         1       92s

% kubectl get sts -n velero
NAME            READY   AGE
velero-log      1/1     98s
velero-pool-0   4/4     3m2s
We can also use the MinIO Operator to examine the tenant configuration. Note that the capacity per volume of 2684354560 bytes works out at 2.5 GiB, which across the 4 volumes adds up to the 10 GiB total requested earlier.
% kubectl minio tenant list

  Tenant 'velero', Namespace 'velero', Total capacity 10 GiB
    Current status: Initialized
    MinIO version: minio/minio:RELEASE.2021-10-06T23-36-31Z

% kubectl minio tenant info velero -n velero

  Tenant 'velero', Namespace 'velero', Total capacity 10 GiB
    Current status: Initialized
    MinIO version: minio/minio:RELEASE.2021-10-06T23-36-31Z
    MinIO service: minio/ClusterIP (port 80)
    Console service: velero-console/ClusterIP (port 9090)

+------+---------+--------------------+---------------------+
| POOL | SERVERS | VOLUMES PER SERVER | CAPACITY PER VOLUME |
+------+---------+--------------------+---------------------+
|    0 |       4 |                  1 |          2684354560 |
+------+---------+--------------------+---------------------+
Step 3: Create a bucket for the first MinIO tenant
Let’s connect to the MinIO Console of the tenant using the Load Balancer address, rather than the port forward to the MinIO Operator used previously. Log in with the same Access Key and Secret Key which you safely stored earlier. Once logged in, select our new tenant and create an S3 bucket for the Velero backups.
As you can see, there is nothing yet on the dashboard that is of much interest. Click on the bucket icon on the left hand side of the window.
Provide the bucket with a name, in this case velero-backups, and click Save.
The tenant now has a bucket.
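For reference, the bucket could also be created from the command line with the MinIO client (mc). This is just a sketch with a few assumptions of my own: the alias name velero-minio is arbitrary, xx.yy.zz.154 is the (masked) MinIO endpoint address from the tenant summary, and the Access Key and Secret Key are the ones saved earlier.

% mc alias set velero-minio http://xx.yy.zz.154 <ACCESS_KEY> <SECRET_KEY>
% mc mb velero-minio/velero-backups
% mc ls velero-minio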
Now that the S3 Object Store / backup target is successfully deployed and configured, we can turn our attention to Tanzu Mission Control, and begin the process of setting up Data Protection for the various managed clusters and their objects.
Step 4: Configure Tanzu Mission Control Data Protection
There are 3 steps that need to be completed in order to set up Tanzu Mission Control with Data Protection. We can summarize these as:
- Create the backup credentials for the S3 Object Store
- Add the S3 Object Store as a backup target
- Enable Data Protection on the cluster(s)
Once we complete these steps, we will be able to take a backup of a cluster, or some cluster objects, via Tanzu Mission Control. Let’s delve into each of these steps in turn.
4.1 Create an S3 Account Credential
To create an Account Credential for the MinIO S3 Object Store, from TMC, select Administration, and in Accounts select Customer provisioned storage credential S3.
Give the credential a name that is easily recognizable. I have added the Access Key and Secret Key provided previously.
The new account credential should now be visible.
4.2 Create a backup Target Location
In TMC, again select Administration, and then create a new Target Location. Choose Customer provisioned S3-compatible storage.
Select the account credentials for this object store, which you just created in the previous step.
Next, provide a URL for the S3-compatible object store. This is the MinIO Endpoint shown in the tenant summary in step 2, not the Console link. Note that it takes a URL, not a bare IP address.
Decide who can have access to the target. In this example, I am providing access to all of my clusters in the cluster group called cormac-cluster-group.
Give the target location a name and create it.
The TMC Data Protection target location is now created.
Now that we have a target location for backups, we should be able to set up Data Protection on the clusters.
4.3 Enable Data Protection
To enable Data Protection on a cluster in TMC, select the cluster on which you want to initiate the backup. Then click on “Enable Data protection”, which is available in the lower right-hand side of the cluster view. By default, Data protection is not enabled.
This will launch a popup window, as shown below. As it states, this will install Velero on your cluster.
If the operation is successful, it should show that Data Protection is now enabled on your cluster, and the option to Create backup is now available.
Once data protection is enabled, you should be able to see 2 new pods in the velero namespace created on the target cluster that is being backed up (not the cluster where the MinIO Operator is running). To see these additional pods, you need to switch the Kubernetes configuration context from the cluster where MinIO is installed (which is what we were looking at previously) to the target cluster that is being backed up. A useful troubleshooting tip in my experience is to follow the logs of the velero pod on the target cluster to make sure everything is working. We will look at the logs again when we do a backup.
% kubectl config get-contexts
CURRENT   NAME                              CLUSTER      AUTHINFO               NAMESPACE
          kubernetes-admin@kubernetes       kubernetes   kubernetes-admin
          mgmt-cjh-admin@mgmt-cjh           mgmt-cjh     mgmt-cjh-admin
*         minio-admin@minio                 minio        minio-admin
          tanzu-cli-workload1@workload1     workload1    tanzu-cli-workload1
          tanzu-cli-workload2@workload2     workload2    tanzu-cli-workload2
          tanzu-cli-workload3a@workload3a   workload3a   tanzu-cli-workload3a
          workload1-admin@workload1         workload1    workload1-admin
          workload2-admin@workload2         workload2    workload2-admin
          workload3a-admin@workload3a       workload3a   workload3a-admin

% kubectl config use-context workload3a-admin@workload3a
Switched to context "workload3a-admin@workload3a".

% kubectl get pods -n velero
NAME                      READY   STATUS    RESTARTS   AGE
restic-zs92t              1/1     Running   0          54s
velero-6f4fb865b4-v8wbx   1/1     Running   0          57s

% kubectl logs velero-6f4fb865b4-v8wbx -n velero -f
time="2021-10-08T11:49:46Z" level=info msg="setting log-level to INFO" logSource="pkg/cmd/server/server.go:172"
time="2021-10-08T11:49:46Z" level=info msg="Starting Velero server v1.6.2 (8c9cdb9603446760452979dc77f93b17054ea1cc)" logSource="pkg/cmd/server/server.go:174"
time="2021-10-08T11:49:46Z" level=info msg="No feature flags enabled" logSource="pkg/cmd/server/server.go:178"
.
.
.
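At this point it can also be useful to confirm that Velero on the target cluster knows about the backup target. A hedged sketch, assuming the TMC-managed Velero installation keeps its objects in the velero namespace as shown above; depending on when TMC pushes the target location down to the cluster, the BackupStorageLocation may only appear once the first backup has been created:

% kubectl get backupstoragelocations -n velero
% kubectl describe backupstoragelocations -n velero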
Step 5: Take your first backup
For my first backup, I am going to back up a namespace in my cluster which contains a number of objects that provide an nginx web server. The output below shows the namespace, and some of the associated Kubernetes objects in that namespace.
% kubectl get ns
NAME                    STATUS   AGE
avi-system              Active   45h
cormac-ns               Active   108m
default                 Active   45h
kube-node-lease         Active   45h
kube-public             Active   45h
kube-system             Active   45h
pinniped-concierge      Active   45h
pinniped-supervisor     Active   45h
registry-creds-system   Active   24h
tkg-system              Active   45h
tkg-system-public       Active   45h
tmc-data-protection     Active   24h
velero                  Active   43m
vmware-system-tmc       Active   45h

% kubectl get all -n cormac-ns
NAME                                   READY   STATUS    RESTARTS   AGE
pod/nginx-deployment-585449566-nv24z   1/1     Running   0          106m

NAME                TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                      AGE
service/nginx-svc   LoadBalancer   10.96.164.65   xx.yy.zz.157   443:31234/TCP,80:31076/TCP   106m

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   1/1     1            1           106m

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-deployment-585449566   1         1         1       106m
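For context, the contents of cormac-ns were created along these lines. This is only a rough sketch: the object names match what is shown above, but the image is an assumption on my part, and the real service also exposes port 443, which kubectl expose glosses over here.

% kubectl create deployment nginx-deployment --image=nginx -n cormac-ns
% kubectl expose deployment nginx-deployment --name=nginx-svc --type=LoadBalancer --port=80 -n cormac-ns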
And since this is a virtual / load balancer IP address, I can connect to it and see the nginx landing page.
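A quick way to verify this from the desktop is with curl against the Load Balancer address. Assuming the default nginx landing page is being served and port 80 is reachable, something like the following should return its title:

% curl -s http://xx.yy.zz.157 | grep -i "welcome to nginx"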
In the Data Protection window in the cluster view, click “Create Backup”. This launches the backup wizard. The first step is to decide what to back up. I am only backing up a namespace, so that is what is selected.
Select the target location which we just created (our MinIO S3 object store).
Select a schedule. I am just going to do a one-off backup, right now.
Decide how long you want the backup to be retained. By default it is 30 days.
Give the backup job/schedule a name, and create the backup job.
If everything is working, the backup should complete successfully.
If there are any issues, monitor the logs of the velero pod as shown earlier. This may reveal any potential configuration issues with the backup target.
% kubectl logs velero-6f4fb865b4-v8wbx -n velero -f
time="2021-10-08T11:49:46Z" level=info msg="setting log-level to INFO" logSource="pkg/cmd/server/server.go:172"
time="2021-10-08T11:49:46Z" level=info msg="Starting Velero server v1.6.2 (8c9cdb9603446760452979dc77f93b17054ea1cc)" logSource="pkg/cmd/server/server.go:174"
time="2021-10-08T11:49:46Z" level=info msg="No feature flags enabled" logSource="pkg/cmd/server/server.go:178"
.
.
.
time="2021-10-08T11:59:54Z" level=info msg="Setting up backup log" backup=velero/cluster3-web-server-backup controller=backup logSource="pkg/controller/backup_controller.go:534"
time="2021-10-08T11:59:54Z" level=info msg="Setting up backup temp file" backup=velero/cluster3-web-server-backup logSource="pkg/controller/backup_controller.go:556"
time="2021-10-08T11:59:54Z" level=info msg="Setting up plugin manager" backup=velero/cluster3-web-server-backup logSource="pkg/controller/backup_controller.go:563"
time="2021-10-08T11:59:54Z" level=info msg="Getting backup item actions" backup=velero/cluster3-web-server-backup logSource="pkg/controller/backup_controller.go:567"
time="2021-10-08T11:59:54Z" level=info msg="Setting up backup store to check for backup existence" backup=velero/cluster3-web-server-backup logSource="pkg/controller/backup_controller.go:573"
time="2021-10-08T11:59:54Z" level=info msg="Writing backup version file" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:212"
time="2021-10-08T11:59:54Z" level=info msg="Including namespaces: cormac-ns" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:218"
time="2021-10-08T11:59:54Z" level=info msg="Excluding namespaces: <none>" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:219"
time="2021-10-08T11:59:54Z" level=info msg="Including resources: *" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:222"
time="2021-10-08T11:59:54Z" level=info msg="Excluding resources: <none>" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:223"
time="2021-10-08T11:59:54Z" level=info msg="Backing up all pod volumes using restic: false" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:224"
.
.
.
time="2021-10-08T12:00:03Z" level=info msg="Processing item" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:354" name=nginx-svc-9cbg2 namespace=cormac-ns progress= resource=endpointslices.discovery.k8s.io
time="2021-10-08T12:00:03Z" level=info msg="Backing up item" backup=velero/cluster3-web-server-backup logSource="pkg/backup/item_backupper.go:121" name=nginx-svc-9cbg2 namespace=cormac-ns resource=endpointslices.discovery.k8s.io
time="2021-10-08T12:00:03Z" level=info msg="Backed up 11 items out of an estimated total of 11 (estimate will change throughout the backup)" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:394" name=nginx-svc-9cbg2 namespace=cormac-ns progress= resource=endpointslices.discovery.k8s.io
time="2021-10-08T12:00:03Z" level=info msg="Backed up a total of 11 items" backup=velero/cluster3-web-server-backup logSource="pkg/backup/backup.go:419" progress=
time="2021-10-08T12:00:03Z" level=info msg="Setting up backup store to persist the backup" backup=velero/cluster3-web-server-backup logSource="pkg/controller/backup_controller.go:654"
time="2021-10-08T12:00:03Z" level=info msg="Backup completed" backup=velero/cluster3-web-server-backup controller=backup logSource="pkg/controller/backup_controller.go:664"
.
.
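The Backup custom resource that ends up on the target cluster can also be queried directly with kubectl. A short sketch, using the backup name visible in the logs above:

% kubectl get backups.velero.io -n velero
% kubectl describe backup cluster3-web-server-backup -n velero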
Step 6: Check backup contents in MinIO S3 Object Store
If we now log on to the MinIO console for our tenant (see step 2), the tenant dashboard should show some more interesting information now that we have sent it some backup contents.
And if we look at the velero backup bucket, we can see that it has some objects stored in it.
These objects are of course coming from the velero (TMC Data Protection) backup job, and are the contents of the namespace where Nginx was running.
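The same contents can also be listed from the command line with the MinIO client, reusing the alias sketched out in step 3:

% mc ls --recursive velero-minio/velero-backups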
Step 7: Restore to ensure everything is working
No backup should be deemed good unless it can be successfully restored. To do a real-life restore of the backup, let’s delete the namespace that I just backed up. Note that TMC is going to recreate the namespace because it was created via TMC in the first place, but the recreated namespace will be empty. I will then restore the backup contents from my MinIO S3 bucket, and finally check if the nginx web server app is working once again.
% kubectl get ns
NAME                    STATUS   AGE
avi-system              Active   45h
cormac-ns               Active   118m
default                 Active   45h
kube-node-lease         Active   45h
kube-public             Active   45h
kube-system             Active   45h
pinniped-concierge      Active   45h
pinniped-supervisor     Active   45h
registry-creds-system   Active   24h
tkg-system              Active   45h
tkg-system-public       Active   45h
tmc-data-protection     Active   24h
velero                  Active   43m
vmware-system-tmc       Active   45h

% kubectl get all -n cormac-ns
NAME                                   READY   STATUS    RESTARTS   AGE
pod/nginx-deployment-585449566-nv24z   1/1     Running   0          106m

NAME                TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                      AGE
service/nginx-svc   LoadBalancer   10.96.164.65   xx.yy.zz.157   443:31234/TCP,80:31076/TCP   106m

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   1/1     1            1           106m

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-deployment-585449566   1         1         1       106m

% kubectl delete ns cormac-ns
namespace "cormac-ns" deleted

% kubectl get all -n cormac-ns
No resources found in cormac-ns namespace.

% kubectl get ns
NAME                    STATUS   AGE
avi-system              Active   45h
cormac-ns               Active   9s    <<< namespace is recreated, but it is empty
default                 Active   45h
kube-node-lease         Active   45h
kube-public             Active   45h
kube-system             Active   45h
pinniped-concierge      Active   45h
pinniped-supervisor     Active   45h
registry-creds-system   Active   24h
tkg-system              Active   45h
tkg-system-public       Active   45h
tmc-data-protection     Active   24h
velero                  Active   43m
vmware-system-tmc       Active   45h

% kubectl get all -n cormac-ns
No resources found in cormac-ns namespace.
With my namespace now empty, and my web server down, let’s do a restore. In the cluster where the backup was initiated, select the backup. Then click on the RESTORE button/link.
Decide what you want to restore. I want to restore everything in that backup – the entire backup job. Note the option to restore a selected namespace, which I mentioned at the beginning of this post. This is a new feature that allows you to restore individual namespaces from a full cluster backup.
Give the restore job a name and start the restore process.
You should hopefully observe a successful restore.
Let’s check the namespace to see if the contents were restored. Looks good.
% kubectl get ns
NAME                    STATUS   AGE
avi-system              Active   45h
cormac-ns               Active   7m56s
default                 Active   45h
kube-node-lease         Active   45h
kube-public             Active   45h
kube-system             Active   45h
pinniped-concierge      Active   45h
pinniped-supervisor     Active   45h
registry-creds-system   Active   24h
tkg-system              Active   45h
tkg-system-public       Active   45h
tmc-data-protection     Active   24h
velero                  Active   51m
vmware-system-tmc       Active   45h

% kubectl get all -n cormac-ns
NAME                                   READY   STATUS    RESTARTS   AGE
pod/nginx-deployment-585449566-nv24z   1/1     Running   0          61s

NAME                TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                      AGE
service/nginx-svc   LoadBalancer   10.96.83.165   xx.yy.zz.157   443:30368/TCP,80:32025/TCP   62s

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   1/1     1            1           62s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-deployment-585449566   1         1         1       62s
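The Velero Restore custom resource on the target cluster can be inspected as well. A sketch, where the restore name is a hypothetical placeholder for whatever name you gave the restore job in TMC:

% kubectl get restores.velero.io -n velero
% kubectl describe restore <restore-job-name> -n velero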
And is the application working once again? – yes it is!
That completes the overview of using Data Protection (through Velero) on Tanzu Mission Control, using an on-premises MinIO S3 Object Store running in a Tanzu Kubernetes cluster as the backup target/destination.