vSphere with Tanzu backup/restore with Velero vSphere Operator
Last week, I posted about the vSAN Data Persistence platform (DPp), a feature that is now available with VMware Cloud Foundation (VCF) 4.2. In that article, I went through the setup of the Minio Operator in vSphere with Tanzu, and then we created a Minio Tenant with its own S3 Object Store. In other words, we were able to assign an on-premises S3 Object Store to a vSphere with Tanzu namespace in just a few clicks, which was pretty cool.
Now, one of the other Supervisor services that is available with VCF 4.2 is the Velero vSphere Operator. Many of you will already be familiar with Velero, as it is a very popular backup and migration tool for Kubernetes. The purpose of the Velero vSphere Operator is to provide the ability to backup and restore Kubernetes objects in vSphere with Tanzu, both in the Supervisor Cluster and in the Guest Clusters. Of course, those backups need to be stored somewhere, so why not store them on a newly created on-premises S3 Object Store from the new DPp Minio Operator? Sounds like a plan!
There are a number of steps involved in setting this up. Let’s outline them here first before getting into the details.
- Create a new namespace and Minio Tenant for the S3 Object store. The enabling of the Minio operator won’t be shown as it follows the same steps as outlined in my previous post. One important point to note is that we will be disabling TLS in this deployment, so you will need to use the Advanced Configuration setting when creating the Tenant.
- Deploy the Velero vSphere Operator. This automatically creates its own namespace.
- Download the velero-vsphere installer. This is similar to the standard Velero installer in appearance, but has been written specifically to work with vSphere with Tanzu.
- Download the Velero client to your desktop so that backup jobs can be initiated. This is the standard Velero binary.
- Create a Velero installer script, which defines items such as the S3 URL and S3 login credentials, among other things. This is to avoid a really long command line syntax.
- Run the Velero installer script. This step deploys the Velero server and backup driver, and provides the ability to backup/restore objects in vSphere with Tanzu.
- Test a backup and restore using the Velero client.
Let’s now get into the specific steps in greater detail. I will skip step 1 as I will assume that an S3 object store is available in a namespace for me to use as a backup destination. There are two things to keep in mind. The first is that the Velero vSphere Operator does not have support for certificates in this initial version, so we need to disable TLS. To do this, you need to enter the Advanced Mode during Tenant Creation. This is a sample configuration to show where the Advanced Mode option is located.
To disable any reliance on TLS and certificates, when you get to the Security section, un-select the Enable TLS checkbox. We are working on including certificate support in the Velero vSphere Operator going forward.
One other item to keep in mind when you are deploying these images from a custom registry or air-gapped registry is that you will need to provide this information again in the Tenant setup. If you previously deployed the Minio Operator from an air-gapped registry, the registry settings do not automatically apply to the Tenant. Instead, you will need to provide the custom images and custom image registry details via the Advanced Mode as well. Note that this step is optional, and if not populated, the images are automatically pulled from the external docker hub by default. Here is what this screen should look like:
Finally, after setting up the Tenant, you should also take a note of the access key and secret key credentials when the Tenant is created. I have deployed the S3 Minio object store (Tenant) in a namespace called velero-ns in this example. If the Minio Tenant has deployed successfully, its objects should now be visible, as follows, in the velero-ns namespace:
$ kubectl config get-contexts
CURRENT   NAME              CLUSTER    AUTHINFO                                   NAMESPACE
*         20.0.0.1          20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          20.0.0.1          20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          minio-domain-c8   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   minio-domain-c8
          velero-ns         20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-ns

$ kubectl config use-context velero-ns
Switched to context "velero-ns".

$ kubectl get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/velero-console-779694f649-d66hc   1/1     Running   0          13h
pod/velero-zone-0-0                   1/1     Running   0          13h
pod/velero-zone-0-1                   1/1     Running   0          13h
pod/velero-zone-0-2                   1/1     Running   0          13h
pod/velero-zone-0-3                   1/1     Running   0          13h

NAME                     TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
service/minio            LoadBalancer   10.96.1.113   20.0.0.7      443:30942/TCP                   13h
service/velero-console   LoadBalancer   10.96.3.46    20.0.0.5      9090:31166/TCP,9443:30217/TCP   13h
service/velero-hl        ClusterIP      None          <none>        9000/TCP                        13h

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/velero-console   1/1     1            1           13h

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/velero-console-779694f649   1         1         1       13h

NAME                             READY   AGE
statefulset.apps/velero-zone-0   4/4     13h
Enable the Velero vSphere Operator
Now that we have an S3 object store available as a backup destination, we can turn our attention to the Velero vSphere Operator. To enable the Velero vSphere Operator, navigate to Cluster > Supervisor Services > Services. Velero is one of the new Supervisor services now available. Select it, and click on the ENABLE link.
The first piece of information that you may need to populate, once again, is a repository where the Velero images are stored. This can be left blank, and the images will be pulled from docker hub using anonymous credentials (same as the Minio Operator). However, because of the recent rate limiting implemented by docker, you may want to log in to docker hub using your own credentials, pull the images manually, and store them in a local repository. It depends on how often you will be doing this task, I suppose. Since I have pulled the images from docker hub and pushed them up to my local Harbor repository, I’ve provided this information during the Velero vSphere Operator enable step.
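If you do decide to mirror the images, the workflow is the usual pull, tag and push. Here is a minimal sketch for the Velero server and plugin images that are referenced later in my install script; the upstream docker hub repository names are my assumption, so check the release notes for the exact images and tags your version needs (the operator's own images can be mirrored in the same way).

# Mirror the Velero images into a local Harbor registry (20.0.0.2/cormac-ns in my setup).
# Upstream repository names below are assumptions -- verify them against the release notes.
docker login 20.0.0.2

docker pull velero/velero:v1.5.1
docker tag  velero/velero:v1.5.1 20.0.0.2/cormac-ns/velero:v1.5.1
docker push 20.0.0.2/cormac-ns/velero:v1.5.1

docker pull velero/velero-plugin-for-aws:v1.1.0
docker tag  velero/velero-plugin-for-aws:v1.1.0 20.0.0.2/cormac-ns/velero-plugin-for-aws:v1.1.0
docker push 20.0.0.2/cormac-ns/velero-plugin-for-aws:v1.1.0

docker pull vsphereveleroplugin/velero-plugin-for-vsphere:1.1.0
docker tag  vsphereveleroplugin/velero-plugin-for-vsphere:1.1.0 20.0.0.2/cormac-ns/velero-plugin-for-vsphere:1.1.0
docker push 20.0.0.2/cormac-ns/velero-plugin-for-vsphere:1.1.0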
Next, agree to the terms of the EULA.
This will proceed with creating a new namespace, in my case velero-vsphere-domain-c8. What you might find interesting at this point is that there are no Pod VMs created in the vSphere UI, even though 4 Pods are instantiated in the vSphere with Tanzu Supervisor cluster. The reason for this is that the Velero vSphere Operator Pods are deployed with the control plane node selector, so they run on the control plane nodes, not on the worker nodes. For example, here is a list of all the nodes (control plane and workers) from my vSphere with Tanzu environment. When I display the Pods for the Velero vSphere Operator, note that they are all running on control plane nodes. Because of this, they do not appear in the vSphere UI as Pod VMs. This was behaviour that I was unaware of until I began to deploy this operator. You might well ask why this is the case. The answer is quite simple – the control plane nodes have access to the vSphere management network and can communicate with vCenter server. Pods deployed on the control plane nodes may also be given access to this network, whereas ordinary Pod VMs deployed on worker nodes only have access to the networks that were allocated to them as part of creating the namespace in which they reside.
$ kubectl get nodes
NAME                               STATUS   ROLES    AGE   VERSION
42163b88513505fa695d51fa2e2aa1f0   Ready    master   10d   v1.18.2-6+38ac483e736488
421642a81662a903edcbeef9e388b75e   Ready    master   10d   v1.18.2-6+38ac483e736488
42165fe73f7731599e9a8f75e27aefc3   Ready    master   10d   v1.18.2-6+38ac483e736488
esxi-dell-m.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60
esxi-dell-n.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60
esxi-dell-o.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60
esxi-dell-p.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60

$ kubectl config use-context velero-vsphere-domain-c8
Switched to context "velero-vsphere-domain-c8".

$ kubectl get pods -o wide
NAME                                               READY   STATUS    RESTARTS   AGE   IP           NODE                               NOMINATED NODE   READINESS GATES
velero-vsphere-operator-5fd45db6b6-4ncv7           1/1     Running   0          30h   10.244.1.5   42165fe73f7731599e9a8f75e27aefc3   <none>           <none>
velero-vsphere-operator-webhook-79ffbdcd69-dcsqf   1/1     Running   0          30h   10.244.1.5   42165fe73f7731599e9a8f75e27aefc3   <none>           <none>
velero-vsphere-operator-webhook-79ffbdcd69-h5v2q   1/1     Running   0          30h   10.244.1.4   42163b88513505fa695d51fa2e2aa1f0   <none>           <none>
velero-vsphere-operator-webhook-79ffbdcd69-xz86q   1/1     Running   0          30h   10.244.1.2   421642a81662a903edcbeef9e388b75e   <none>           <none>
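If you want to confirm how this placement is achieved, you can inspect the node selector on the operator's deployments. A quick sketch of one way to do it (the jsonpath query is mine, not from the product documentation):

# Show the node selector configured on each deployment in the operator namespace
kubectl -n velero-vsphere-domain-c8 get deployments \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.nodeSelector}{"\n"}{end}'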
Download the Velero Client and vSphere Operator CLI
The Velero Client (version 1.5.3 at the time of writing) is available here. The Velero vSphere Operator CLI (version 1.1.0 at the time of writing) is available here. Download both, extract them and optionally add them to your $PATH. One additional step is to configure the namespace that the client should work against (by default, it expects the namespace velero). Since my namespace is velero-ns, I have to tell the client to use that namespace.
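For reference, the download and extract steps look something like the following; the archive names are illustrative only (they change with each release), so substitute the versions you actually downloaded.

# Extract the Velero client and the velero-vsphere CLI, then put them on the PATH.
# Archive names below are examples -- match them to the releases you downloaded.
tar -xvf velero-v1.5.3-linux-amd64.tar.gz
sudo cp velero-v1.5.3-linux-amd64/velero /usr/local/bin/

tar -xvf velero-vsphere-1.1.0-linux-amd64.tar.gz   # provides the velero-vsphere installer CLI
chmod +x velero-vsphere                            # run from this directory, or copy onto the PATH as well

velero version --client-only                       # quick check that the client is found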
$ velero client config set namespace=velero-ns

$ velero client config get
namespace: velero-ns
Create the Velero Operator CLI installer script
Whilst you could type the installer options as one long command at the command line, you may find it far easier to put them in a script since there are a lot of parameters to be added. Here is the script that I created for my installation.
#!/bin/sh

NAMESPACE="velero-ns"
BUCKET="backup-bucket"
REGION=minio
S3URL="http://20.0.0.5/"
PublicURL="http://20.0.0.5/"
VELERO_IMAGE=20.0.0.2/cormac-ns/velero:v1.5.1
VSPHERE_PLUGIN=20.0.0.2/cormac-ns/velero-plugin-for-vsphere:1.1.0
AWS_PLUGIN=20.0.0.2/cormac-ns/velero-plugin-for-aws:v1.1.0

./velero-vsphere install \
    --namespace $NAMESPACE \
    --image $VELERO_IMAGE \
    --use-private-registry \
    --provider aws \
    --plugins $AWS_PLUGIN,$VSPHERE_PLUGIN \
    --bucket $BUCKET \
    --secret-file ./velero-minio-credentials \
    --snapshot-location-config region=$REGION \
    --backup-location-config region=$REGION,s3ForcePathStyle="true",s3Url=$S3URL,publicUrl=$PublicURL
Let’s go through some of the variables first.
- NAMESPACE – The namespace where the Velero server components will be installed. I am placing them in the same namespace as the S3 object store, away from all of my application namespaces to make things easier.
- BUCKET – A bucket that I have created on my S3 object store for my backups. This can be created from the Minio S3 browser client or the Minio S3 console (or from the command line, as sketched after this list).
- REGION – Not sure, but possibly related to the path to the bucket on the S3 object store. Defaults to minio. I’ve never seen it set to anything else.
- S3URL – How Velero backup/restore operations should access the S3 object store. I’m using the external address of the object store here.
- PublicURL – How the Velero client should access the S3 object store, e.g. for viewing logs. Again, I’m using the external address of the object store here.
- VELERO_IMAGE – The Velero Server Image. Set here because I am not pulling images from an external docker hub, but from my own on-premises Harbor repository. I have already pulled and pushed this image to my local registry. If you are pulling the image from default docker registry, this setting is not needed.
- VSPHERE_PLUGIN – A plugin that enables snapshots of Persistent Volumes using VADP. Already pushed to the on-premises registry. If you are pulling the image from the default docker registry, this setting is not needed.
- AWS_PLUGIN – A plugin that knows how to utilize an S3 Object Store destination. Already pushed to the on-premises registry. If you are pulling the image from the default docker registry, this setting is not needed.
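As mentioned for BUCKET above, the backup bucket can also be created from the command line with the MinIO mc client. This is just a sketch: the alias name is arbitrary, the endpoint mirrors the S3URL used in my install script, and the keys are the Tenant credentials noted earlier.

# Create the backup bucket with the MinIO client (mc) -- alias name is arbitrary
mc alias set velero-minio http://20.0.0.5/ <access-key> <secret-key>
mc mb velero-minio/backup-bucket
mc ls velero-minio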
Let’s now look at the install command. I think there are only 2 items that might be different from a standard Velero install.
- --use-private-registry – since I have already deployed the Velero vSphere Operator using images from my private (Harbor) registry, I have already provided my repository credentials at that point. There is no way to provide credentials to the repository via the velero-vsphere binary, so this flag tells the installer that it can use the same repository without me having to add credentials once more, since these images are being pulled from the same registry.
- --secret-file – contains the credentials to the Minio S3 Object Store. You will have noted these down when you built the Minio S3 Tenant earlier, e.g.
$ cat velero-minio-credentials
[default]
aws_access_key_id = XWQ1TB0UJ3ZE4FMD
aws_secret_access_key = JQZ0L5O4ZEKBFYZAZ25M4LA5ZIU2UKFW
More details and other configuration settings can be found in the official documentation here. With the script in place, I can run it and check to make sure that the new Pods (velero and backup-driver) are stood up in the selected namespace, in this case velero-ns. Note that this install only has to be done once. The velero client can then be used to backup applications in any of the namespaces in the vSphere with Tanzu environment.
$ ./install-on-velero-ns-from-harbor.sh
Send the request to the operator about installing Velero in namespace velero-ns

$ kubectl config get-contexts
CURRENT   NAME                       CLUSTER    AUTHINFO                                   NAMESPACE
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          minio-domain-c8            20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   minio-domain-c8
*         velero-ns                  20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-ns
          velero-vsphere-domain-c8   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-vsphere-domain-c8

$ kubectl get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/backup-driver-6765cf5cb7-7dv8l    1/1     Running   0          2m3s
pod/velero-7bd5bcf869-c6mk9           1/1     Running   0          2m9s
pod/velero-console-779694f649-d66hc   1/1     Running   0          13h
pod/velero-zone-0-0                   1/1     Running   0          13h
pod/velero-zone-0-1                   1/1     Running   0          13h
pod/velero-zone-0-2                   1/1     Running   0          13h
pod/velero-zone-0-3                   1/1     Running   0          13h

NAME                     TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
service/minio            LoadBalancer   10.96.1.113   20.0.0.7      443:30942/TCP                   13h
service/velero-console   LoadBalancer   10.96.3.46    20.0.0.5      9090:31166/TCP,9443:30217/TCP   13h
service/velero-hl        ClusterIP      None          <none>        9000/TCP                        13h

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/backup-driver    1/1     1            1           2m3s
deployment.apps/velero           1/1     1            1           2m9s
deployment.apps/velero-console   1/1     1            1           13h

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/backup-driver-6765cf5cb7    1         1         1       2m3s
replicaset.apps/velero-7bd5bcf869           1         1         1       2m9s
replicaset.apps/velero-console-779694f649   1         1         1       13h

NAME                             READY   AGE
statefulset.apps/velero-zone-0   4/4     13h
This looks good. Note again that there is a difference between the kubectl output shown here and the vSphere UI. There are 2 new Pods shown above (velero and backup-driver), but only one appears in the vSphere UI. The backup-driver Pod is deployed to the control plane nodes, so it will not show up in the UI. The velero Pod runs on the worker nodes, so it will appear as a Pod VM in the UI.
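One quick way to confirm this placement for yourself is to list the Pods in the velero-ns namespace with the wide output option, which adds a NODE column showing where each Pod is running:

# Confirm which nodes the velero and backup-driver Pods landed on
kubectl -n velero-ns get pods -o wide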
Once this step has completed, you can also check the status of the client and server / operator.
$ velero version
Client:
        Version: v1.5.3
        Git commit: 123109a3bcac11dbb6783d2758207bac0d0817cb
Server:
        Version: v1.5.1
The status of the Velero service itself is also worth checking at this point.
$ kubectl -n velero-ns get veleroservice default -o json | jq '.status'
{
  "enabled": true,
  "installphase": "Completed",
  "version": "v1.5.1"
}
Take a simple backup using the Velero client
The last step in the sequence is to ensure we can take a backup. Let’s do that next. First of all though, check the namespace configuration of your Velero client once more, and ensure it is pointing at the namespace where Velero was installed.
$ velero client config get
namespace: velero-ns
Next, I am going to change context to my application namespace, called cormac-ns. Here I have a Cassandra NoSQL DB stateful set, and also an Nginx deployment running a web server (no persistent data to back up). Note that these applications are deployed as Pod VMs running on my vSphere with Tanzu Supervisor cluster.
$ kubectl config get-contexts
CURRENT   NAME                       CLUSTER    AUTHINFO                                   NAMESPACE
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          minio-domain-c8            20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   minio-domain-c8
*         velero-ns                  20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-ns
          velero-vsphere-domain-c8   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-vsphere-domain-c8

$ kubectl config use-context cormac-ns
Switched to context "cormac-ns".

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
cassandra-0                        1/1     Running   0          10d
cassandra-1                        1/1     Running   0          10d
cassandra-2                        1/1     Running   0          10d
nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10d
nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10d
nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10d
I am now going to take a backup of my Nginx deployment. Before that, let’s take a look at my S3 object store. So far, it is empty. Let’s see if we can get some backup information stored to it.
I am going to use the Pod labels as a way of selecting what I want Velero to backup. First, list the labels on the Pods. Note the Nginx Pods have app=nginx labels, so I will use that.
$ kubectl get pods --show-labels
NAME                               READY   STATUS    RESTARTS   AGE   LABELS
cassandra-0                        1/1     Running   0          10d   app=cassandra,controller-revision-hash=cassandra-54d8d8874f,statefulset.kubernetes.io/pod-name=cassandra-0
cassandra-1                        1/1     Running   0          10d   app=cassandra,controller-revision-hash=cassandra-54d8d8874f,statefulset.kubernetes.io/pod-name=cassandra-1
cassandra-2                        1/1     Running   0          10d   app=cassandra,controller-revision-hash=cassandra-54d8d8874f,statefulset.kubernetes.io/pod-name=cassandra-2
nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10d   app=nginx,pod-template-hash=b4d6b7cf8
nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10d   app=nginx,pod-template-hash=b4d6b7cf8
nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10d   app=nginx,pod-template-hash=b4d6b7cf8
Now we are ready to take the backup.
$ velero backup get

$ velero backup create nginx-backup --selector app=nginx
Backup request "nginx-backup" submitted successfully.
Run `velero backup describe nginx-backup` or `velero backup logs nginx-backup` for more details.

$ velero backup get
NAME           STATUS       ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   InProgress   0        0          2021-02-23 17:53:31 +0000 GMT   30d       default            app=nginx

$ velero backup describe nginx-backup
Name:         nginx-backup
Namespace:    velero-ns
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.2-6+38ac483e736488
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18+

Phase:  InProgress

Errors:    0
Warnings:  0

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=nginx

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2021-02-23 17:53:31 +0000 GMT
Completed:  <n/a>

Expiration:  2021-03-25 17:53:31 +0000 GMT

Velero-Native Snapshots: <none included>
And hopefully after a moment or two, since there isn’t a lot of data to transfer in this backup, we should see it complete.
$ velero backup get
NAME           STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   0        0          2021-02-23 17:53:31 +0000 GMT   30d       default            app=nginx
The logs from the backup can be displayed using the command velero backup logs nginx-backup, but we are more interested in seeing if anything got transferred to the S3 Object Store.
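If you prefer the command line to the Minio browser, the same check can be done with the mc client, assuming the alias created in the earlier sketch; Velero writes its backup metadata under a backups/ prefix in the bucket.

# List what the backup pushed into the bucket (alias from the earlier mc sketch)
mc ls --recursive velero-minio/backup-bucket/backups/nginx-backup/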
Looks good to me. If you want to see more details about the backup, you can run the following command and it will show the resources that were captured in the backup:
$ velero backup describe nginx-backup --details
Name:         nginx-backup
Namespace:    velero-ns
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.2-6+38ac483e736488
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18+

Phase:  Completed

Errors:    0
Warnings:  0

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=nginx

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2021-02-23 17:53:31 +0000 GMT
Completed:  2021-02-23 17:53:46 +0000 GMT

Expiration:  2021-03-25 17:53:31 +0000 GMT

Total items to be backed up:  7
Items backed up:              7

Resource List:
  apps/v1/Deployment:
    - cormac-ns/nginx-deployment
  apps/v1/ReplicaSet:
    - cormac-ns/nginx-deployment-b4d6b7cf8
  v1/Endpoints:
    - cormac-ns/nginx-svc
  v1/Pod:
    - cormac-ns/nginx-deployment-b4d6b7cf8-cfftz
    - cormac-ns/nginx-deployment-b4d6b7cf8-q29lr
    - cormac-ns/nginx-deployment-b4d6b7cf8-zmfhd
  v1/Service:
    - cormac-ns/nginx-svc

Velero-Native Snapshots: <none included>
Restore
There is not much point in having a backup if I cannot restore it. I am now going to remove the Nginx deployment and service from the cormac-ns namespace, and restore them using Velero.
$ kubectl config use-context cormac-ns
Switched to context "cormac-ns".

$ kubectl get all
NAME                                   READY   STATUS    RESTARTS   AGE
pod/cassandra-0                        1/1     Running   0          10d
pod/cassandra-1                        1/1     Running   0          10d
pod/cassandra-2                        1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10d

NAME                                                    TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE
service/cassandra                                       ClusterIP      10.96.3.50    <none>        9042/TCP                     27d
service/nginx-svc                                       LoadBalancer   10.96.1.252   20.0.0.4      443:32662/TCP,80:31832/TCP   27d
service/tkg-cluster-vcf-w-tanzu-control-plane-service   LoadBalancer   10.96.0.31    20.0.0.3      6443:31677/TCP               71d

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   3/3     3            3           27d

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-deployment-b4d6b7cf8   3         3         3       27d

NAME                         READY   AGE
statefulset.apps/cassandra   3/3     27d

$ kubectl delete deployment.apps/nginx-deployment
deployment.apps "nginx-deployment" deleted

$ kubectl delete service/nginx-svc
service "nginx-svc" deleted

$ kubectl get all
NAME              READY   STATUS    RESTARTS   AGE
pod/cassandra-0   1/1     Running   0          10d
pod/cassandra-1   1/1     Running   0          10d
pod/cassandra-2   1/1     Running   0          10d

NAME                                                    TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
service/cassandra                                       ClusterIP      10.96.3.50   <none>        9042/TCP         27d
service/tkg-cluster-vcf-w-tanzu-control-plane-service   LoadBalancer   10.96.0.31   20.0.0.3      6443:31677/TCP   71d

NAME                         READY   AGE
statefulset.apps/cassandra   3/3     27d

$ velero backup get
NAME           STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   0        0          2021-02-23 17:53:31 +0000 GMT   29d       default            app=nginx

$ velero restore create restore-nginx --from-backup nginx-backup
Restore request "restore-nginx" submitted successfully.
Run `velero restore describe restore-nginx` or `velero restore logs restore-nginx` for more details.

$ velero restore describe restore-nginx
Name:         restore-nginx
Namespace:    velero-ns
Labels:       <none>
Annotations:  <none>

Phase:  Completed

Started:    2021-02-24 09:25:59 +0000 GMT
Completed:  2021-02-24 09:25:59 +0000 GMT

Backup:  nginx-backup

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto

$ kubectl get all
NAME                                   READY   STATUS    RESTARTS   AGE
pod/cassandra-0                        1/1     Running   0          10d
pod/cassandra-1                        1/1     Running   0          10d
pod/cassandra-2                        1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10s
pod/nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10s
pod/nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10s

NAME                                                    TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
service/cassandra                                       ClusterIP      10.96.3.50   <none>        9042/TCP                     27d
service/nginx-svc                                       LoadBalancer   10.96.3.65   20.0.0.7      443:32575/TCP,80:30197/TCP   10s
service/tkg-cluster-vcf-w-tanzu-control-plane-service   LoadBalancer   10.96.0.31   20.0.0.3      6443:31677/TCP               71d

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   3/3     3            3           10s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-deployment-b4d6b7cf8   3         3         3       10s

NAME                         READY   AGE
statefulset.apps/cassandra   3/3     27d
The restore appears to have been successful. The deployment, replicaset, service and pods have all been restored, although the nginx service has come back on a new Load Balancer IP address (which is to be expected). The other thing to note is that no nginx images were backed up or restored in this example. Velero backed up and restored the Kubernetes manifests only; the images were pulled again from the registry referenced in those manifests.
So this is a very simple backup and restore just to verify the functionality. I need to spend some more cycles on this setup and try some additional steps, such as restoring to another namespace, and backing up and restoring applications that use Persistent Volumes, as well as TKG “guest” clusters of course. But that is enough for the moment.
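As a pointer for the restore-to-another-namespace test, the Velero client already supports this via namespace mappings. A minimal sketch follows; the target namespace name is made up, and in vSphere with Tanzu it would presumably need to exist as a vSphere namespace first.

# Restore the nginx backup into a different namespace (target name is just an example)
velero restore create nginx-restore-copy --from-backup nginx-backup \
    --namespace-mappings cormac-ns:cormac-ns-copy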
Conclusion
Well, that looks pretty nice to me. We have the Minio S3 Object Store (provisioned from the vSAN Data Persistence platform – DPp) providing us with a destination that can be used for storing our backups of Supervisor cluster objects taken via the Velero vSphere Operator. These services are running hand-in-hand, side-by-side in vSphere with Tanzu (on VCF). Things are starting to come together nicely.
Hi Cormac,
thanks for this very helpful blog article!
I was able to install MinIO and Velero, and I successfully created a backup of my WordPress demo application.
However, when trying to restore, the process gets stuck during creation of the PVCs.
Are there any known issues with persistent volume claims?
After the failed restore attempt, I am not able to remove the namespace where the failed restore occurred.
It seems, something is blocking the deletion. Any ideas?
Best regards,
Volker
Hi Volker – have you set the Velero Data Manager? This is needed in order to snapshot and move data to the S3 object store. See my most recent post.
Hi Cormac,
I didn’t realize, that you already wrote a follow-up article for stateful pods with PVCs.
Now I was able to deploy and configure the Velero Data Manager, but restore still isn’t working 🙁
kurthv@VDI-VK:~/k8s-examples/nginx-app$ velero restore get
NAME BACKUP STATUS STARTED COMPLETED ERRORS WARNINGS CREATED SELECTOR
restore-nginx nginx-backup New 0 0 2021-03-04 15:52:25 +0100 CET
kurthv@VDI-VK:~/k8s-examples/nginx-app$ velero describe restore restore-nginx
Name: restore-nginx
Namespace: velero-ns
Labels:       <none>
Annotations:  <none>
Phase:
Started:
Completed:
Backup:  nginx-backup
Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>
Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto
Namespace mappings:  <none>
Label selector:  <none>
Restore PVs:  auto
Any ideas?
BTW:
– For testing purposes I used the nginx-app example “with-pv.yaml” which comes with the velero binaries.
– I also tried to backup & restore the wordpress demo application from https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/
When I perform a backup of the NGINX application, I can see data uploaded to the plugins directory in the MinIO bucket.
When I perform a backup of the wordpress application, I don’t see data in the plugins directory – only in the backups directory.
Here you can find the logs:
https://www.dropbox.com/s/bpq9wapy6tx2xog/velero-logs.tgz?dl=0
Hey Volker, I would look at past issues, and perhaps even raise a new issue here – https://github.com/vmware-tanzu/velero-plugin-for-vsphere/issues. This is probably the best way to get assistance.
Okay, I will first test with your Cassandra stateful app, and maybe then raise a new issue on GitHub.