vSphere with Tanzu backup/restore with Velero vSphere Operator
Last week, I posted about the vSAN Data Persistence platform (DPp), a feature that is now available with VMware Cloud Foundation (VCF) 4.2. In that article, I went through the setup of the Minio Operator in vSphere with Tanzu, and then we created a Minio Tenant with its own S3 Object Store. In other words, we were able to assign an on-premises S3 Object Store to a vSphere with Tanzu namespace in just a few clicks, which was pretty cool.
Now, one of the other Supervisor services that is available with VCF 4.2 is the Velero vSphere Operator. Many of you will already be familiar with Velero, as it is a very popular backup and migration tool for Kubernetes. The purpose of the Velero vSphere Operator is to provide the ability to backup and restore Kubernetes objects in vSphere with Tanzu, both in the Supervisor Cluster and in the Guest Clusters. Of course, those backups need to be stored somewhere, so why not store them on a newly created on-premises S3 Object Store from the new DPp Minio Operator? Sounds like a plan!
There are a number of steps involved in setting this up. Let’s outline them here first before getting into the details.
- Create a new namespace and Minio Tenant for the S3 Object store. The enabling of the Minio operator won’t be shown as it follows the same steps as outlined in my previous post. One important point to note is that we will be disabling TLS in this deployment, so you will need to use the Advanced Configuration setting when creating the Tenant.
- Deploy the Velero vSphere Operator. This automatically creates its own namespace.
- Download the velero-vsphere installer. This is similar to the standard Velero installer in appearance, but has been written specifically to work with vSphere with Tanzu.
- Download the Velero client to your desktop so that backup jobs can be initiated. This is the standard Velero binary.
- Create a Velero installer script, which defines items such as the S3 URL and S3 login credentials, among other things. This is to avoid a really long command line syntax.
- Run the Velero installer script. This step deploys the Velero server and backup driver, and provides the ability to backup/restore objects in vSphere with Tanzu.
- Test a backup and restore using the Velero client.
Let’s now get into the specific steps in greater detail. I will skip step 1 as I will assume that an S3 object store is available in a namespace for me to use as a backup destination. There are two things to keep in mind. The first is that the Velero vSphere Operator does not have support for certificates in this initial version, so we need to disable TLS. To do this, you need to enter the Advanced Mode during Tenant Creation. This is a sample configuration to show where the Advanced Mode option is located.
To disable any reliance on TLS and certificates, when you get to the Security section, un-select the Enable TLS checkbox. We are working on including certificate support in the Velero vSphere Operator going forward.
One other item to keep in mind when you are deploying these images from a custom registry or air-gapped registry is that you will need to provide this information again in the Tenant setup. If you previously deployed the Minio Operator from an air-gapped registry, the registry settings do not automatically apply to the Tenant. Instead, you will need to provide the custom images and custom image registry details via the Advanced Mode as well. Note that this step is optional, and if not populated, the images are automatically pulled from the external docker hub by default. Here is what this screen should look like:
Finally, after setting up the Tenant, you should also take a note of the access key and secret key credentials when the Tenant is created. I have deployed the S3 Minio object store (Tenant) in a namespace called velero-ns in this example. If the Minio Tenant has deployed successfully, its objects should now be visible, as follows, in the velero-ns namespace:
$ kubectl config get-contexts
CURRENT   NAME              CLUSTER    AUTHINFO                                   NAMESPACE
*         20.0.0.1          20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          20.0.0.1          20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          minio-domain-c8   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   minio-domain-c8
          velero-ns         20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-ns

$ kubectl config use-context velero-ns
Switched to context "velero-ns".

$ kubectl get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/velero-console-779694f649-d66hc   1/1     Running   0          13h
pod/velero-zone-0-0                   1/1     Running   0          13h
pod/velero-zone-0-1                   1/1     Running   0          13h
pod/velero-zone-0-2                   1/1     Running   0          13h
pod/velero-zone-0-3                   1/1     Running   0          13h

NAME                     TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
service/minio            LoadBalancer   10.96.1.113   20.0.0.7      443:30942/TCP                   13h
service/velero-console   LoadBalancer   10.96.3.46    20.0.0.5      9090:31166/TCP,9443:30217/TCP   13h
service/velero-hl        ClusterIP      None          <none>        9000/TCP                        13h

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/velero-console   1/1     1            1           13h

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/velero-console-779694f649   1         1         1       13h

NAME                             READY   AGE
statefulset.apps/velero-zone-0   4/4     13h
Enable the Velero vSphere Operator
Now that we have an S3 object store available as a backup destination, we can turn our attention to the Velero vSphere Operator. To enable the Velero vSphere Operator, navigate to Cluster > Supervisor Services > Services. Velero is one of the new Supervisor services now available. Select it, and click on the ENABLE link.
The first piece of information that you may need to populate, once again, is a repository where the Velero images are stored. This can be left blank, and the images will be pulled from docker hub using anonymous credentials (same as the Minio Operator). However, because of the recent rate limiting implemented by docker, you may want to log in to docker hub using your own credentials, pull the images manually, and store them in a local repository. It depends on how often you will be doing this task, I suppose. Since I have pulled the images from docker hub and pushed them up to my local Harbor repository, I’ve provided this information during the Velero vSphere Operator enable step.
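If you do decide to mirror the images, the workflow is the usual pull, tag and push. Here is a minimal sketch for the Velero server and plugin images that are referenced later in my install script; the upstream docker hub repository names are my assumption, so check the release notes for the exact images and tags your version needs (the operator's own images can be mirrored in the same way).

# Mirror the Velero images into a local Harbor registry (20.0.0.2/cormac-ns in my setup).
# Upstream repository names below are assumptions -- verify them against the release notes.
docker login 20.0.0.2

docker pull velero/velero:v1.5.1
docker tag  velero/velero:v1.5.1 20.0.0.2/cormac-ns/velero:v1.5.1
docker push 20.0.0.2/cormac-ns/velero:v1.5.1

docker pull velero/velero-plugin-for-aws:v1.1.0
docker tag  velero/velero-plugin-for-aws:v1.1.0 20.0.0.2/cormac-ns/velero-plugin-for-aws:v1.1.0
docker push 20.0.0.2/cormac-ns/velero-plugin-for-aws:v1.1.0

docker pull vsphereveleroplugin/velero-plugin-for-vsphere:1.1.0
docker tag  vsphereveleroplugin/velero-plugin-for-vsphere:1.1.0 20.0.0.2/cormac-ns/velero-plugin-for-vsphere:1.1.0
docker push 20.0.0.2/cormac-ns/velero-plugin-for-vsphere:1.1.0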
Next, agree to the terms of the EULA.
This will proceed with creating a new namespace, in my case velero-vsphere-domain-c8. What you might find interesting at this point is that there are no Pod VMs created in the vSphere UI, even though 4 Pods are instantiated in the vSphere with Tanzu Supervisor cluster. The reason for this is that the Velero vSphere Operator Pods are deployed with the control plane node selector, so they run on the control plane nodes, not on the worker nodes. For example, here is a list of all the nodes (control plane and workers) from my vSphere with Tanzu environment. When I display the Pods for the Velero vSphere Operator, note that they are all running on control plane nodes. Because of this, they do not appear in the vSphere UI as Pod VMs. This was behaviour that I was unaware of until I began to deploy this operator. You might well ask why this is the case. The answer is quite simple – the control plane nodes have access to the vSphere management network and can communicate with vCenter server. Pods deployed on the control plane nodes may also be given access to this network, whereas ordinary Pod VMs deployed on worker nodes only have access to the networks that were allocated to them as part of creating the namespace in which they reside.
$ kubectl get nodes
NAME                               STATUS   ROLES    AGE   VERSION
42163b88513505fa695d51fa2e2aa1f0   Ready    master   10d   v1.18.2-6+38ac483e736488
421642a81662a903edcbeef9e388b75e   Ready    master   10d   v1.18.2-6+38ac483e736488
42165fe73f7731599e9a8f75e27aefc3   Ready    master   10d   v1.18.2-6+38ac483e736488
esxi-dell-m.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60
esxi-dell-n.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60
esxi-dell-o.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60
esxi-dell-p.rainpole.com           Ready    agent    10d   v1.18.2-sph-83e7e60

$ kubectl config use-context velero-vsphere-domain-c8
Switched to context "velero-vsphere-domain-c8".

$ kubectl get pods -o wide
NAME                                               READY   STATUS    RESTARTS   AGE   IP           NODE                               NOMINATED NODE   READINESS GATES
velero-vsphere-operator-5fd45db6b6-4ncv7           1/1     Running   0          30h   10.244.1.5   42165fe73f7731599e9a8f75e27aefc3   <none>           <none>
velero-vsphere-operator-webhook-79ffbdcd69-dcsqf   1/1     Running   0          30h   10.244.1.5   42165fe73f7731599e9a8f75e27aefc3   <none>           <none>
velero-vsphere-operator-webhook-79ffbdcd69-h5v2q   1/1     Running   0          30h   10.244.1.4   42163b88513505fa695d51fa2e2aa1f0   <none>           <none>
velero-vsphere-operator-webhook-79ffbdcd69-xz86q   1/1     Running   0          30h   10.244.1.2   421642a81662a903edcbeef9e388b75e   <none>           <none>
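If you want to confirm how this placement is achieved, you can inspect the node selector on the operator's deployments. A quick sketch of one way to do it (the jsonpath query is mine, not from the product documentation):

# Show the node selector configured on each deployment in the operator namespace
kubectl -n velero-vsphere-domain-c8 get deployments \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.nodeSelector}{"\n"}{end}'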
Download the Velero Client and vSphere Operator CLI
The Velero Client (version 1.5.3 at the time of writing) is available here. The Velero vSphere Operator CLI (version 1.1.0 at the time of writing) is available here. Download both, extract them and optionally add them to your $PATH. One additional step is to configure the namespace that the client should work against (by default, it expects the namespace velero). Since my namespace is velero-ns, I have to tell the client to use that namespace.
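For reference, the download and extract steps look something like the following; the archive names are illustrative only (they change with each release), so substitute the versions you actually downloaded.

# Extract the Velero client and the velero-vsphere CLI, then put them on the PATH.
# Archive names below are examples -- match them to the releases you downloaded.
tar -xvf velero-v1.5.3-linux-amd64.tar.gz
sudo cp velero-v1.5.3-linux-amd64/velero /usr/local/bin/

tar -xvf velero-vsphere-1.1.0-linux-amd64.tar.gz   # provides the velero-vsphere installer CLI
chmod +x velero-vsphere                            # run from this directory, or copy onto the PATH as well

velero version --client-only                       # quick check that the client is found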
$ velero client config set namespace=velero-ns

$ velero client config get
namespace: velero-ns
Create the Velero Operator CLI installer script
Whilst you could type the installer options as one long command at the command line, you may find it far easier to put them in a script since there are a lot of parameters to be added. Here is the script that I created for my installation.
#!/bin/sh

NAMESPACE="velero-ns"
BUCKET="backup-bucket"
REGION=minio
S3URL="http://20.0.0.5/"
PublicURL="http://20.0.0.5/"
VELERO_IMAGE=20.0.0.2/cormac-ns/velero:v1.5.1
VSPHERE_PLUGIN=20.0.0.2/cormac-ns/velero-plugin-for-vsphere:1.1.0
AWS_PLUGIN=20.0.0.2/cormac-ns/velero-plugin-for-aws:v1.1.0

./velero-vsphere install \
    --namespace $NAMESPACE \
    --image $VELERO_IMAGE \
    --use-private-registry \
    --provider aws \
    --plugins $AWS_PLUGIN,$VSPHERE_PLUGIN \
    --bucket $BUCKET \
    --secret-file ./velero-minio-credentials \
    --snapshot-location-config region=$REGION \
    --backup-location-config region=$REGION,s3ForcePathStyle="true",s3Url=$S3URL,publicUrl=$PublicURL
Let’s go through some of the variables first.
- NAMESPACE – The namespace where the Velero server components will be installed. I am placing them in the same namespace as the S3 object store, away from all of my application namespaces to make things easier.
- BUCKET – A bucket that I have created on my S3 object store for my backups. This can be created from the Minio S3 browser client or the Minio S3 console (or from the command line, as sketched after this list).
- REGION – Not sure, but possibly related to the path to the bucket on the S3 object store. Defaults to minio. I’ve never seen it set to anything else.
- S3URL – How Velero backup/restore operations should access the S3 object store. I’m using the external address of the object store here.
- PublicURL – How the Velero client should access the S3 object store, e.g. for viewing logs. Again, I’m using the external address of the object store here.
- VELERO_IMAGE – The Velero Server Image. Set here because I am not pulling images from an external docker hub, but from my own on-premises Harbor repository. I have already pulled and pushed this image to my local registry. If you are pulling the image from default docker registry, this setting is not needed.
- VSPHERE_PLUGIN – A plugin that enables snapshots of Persistent Volumes using VADP. Already pushed to the on-premises registry. If you are pulling the image from the default docker registry, this setting is not needed.
- AWS_PLUGIN – A plugin that knows how to utilize an S3 Object Store destination. Already pushed to the on-premises registry. If you are pulling the image from the default docker registry, this setting is not needed.
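As mentioned for BUCKET above, the backup bucket can also be created from the command line with the MinIO mc client. This is just a sketch: the alias name is arbitrary, the endpoint mirrors the S3URL used in my install script, and the keys are the Tenant credentials noted earlier.

# Create the backup bucket with the MinIO client (mc) -- alias name is arbitrary
mc alias set velero-minio http://20.0.0.5/ <access-key> <secret-key>
mc mb velero-minio/backup-bucket
mc ls velero-minio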
Let’s now look at the install command. I think there are only 2 items that might be different from a standard Velero install.
- --use-private-registry – since I have already deployed the Velero vSphere Operator using images from my private (Harbor) registry, I have already provided my repository credentials at that point. There is no way to provide credentials to the repository via the velero-vsphere binary, so this flag tells the installer that it can use the same repository without me having to add credentials once more, since these images are being pulled from the same registry.
- --secret-file – contains the credentials to the Minio S3 Object Store. You will have noted these down when you built the Minio S3 Tenant earlier, e.g.
$ cat velero-minio-credentials
[default]
aws_access_key_id = XWQ1TB0UJ3ZE4FMD
aws_secret_access_key = JQZ0L5O4ZEKBFYZAZ25M4LA5ZIU2UKFW
More details and other configuration settings can be found in the official documentation here. With the script in place, I can run it and check to make sure that the new Pods (velero and backup-driver) are stood up in the selected namespace, in this case velero-ns. Note that this install only has to be done once. The velero client can then be used to backup applications in any of the namespaces in the vSphere with Tanzu environment.
$ ./install-on-velero-ns-from-harbor.sh
Send the request to the operator about installing Velero in namespace velero-ns

$ kubectl config get-contexts
CURRENT   NAME                       CLUSTER    AUTHINFO                                   NAMESPACE
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          minio-domain-c8            20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   minio-domain-c8
*         velero-ns                  20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-ns
          velero-vsphere-domain-c8   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-vsphere-domain-c8

$ kubectl get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/backup-driver-6765cf5cb7-7dv8l    1/1     Running   0          2m3s
pod/velero-7bd5bcf869-c6mk9           1/1     Running   0          2m9s
pod/velero-console-779694f649-d66hc   1/1     Running   0          13h
pod/velero-zone-0-0                   1/1     Running   0          13h
pod/velero-zone-0-1                   1/1     Running   0          13h
pod/velero-zone-0-2                   1/1     Running   0          13h
pod/velero-zone-0-3                   1/1     Running   0          13h

NAME                     TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
service/minio            LoadBalancer   10.96.1.113   20.0.0.7      443:30942/TCP                   13h
service/velero-console   LoadBalancer   10.96.3.46    20.0.0.5      9090:31166/TCP,9443:30217/TCP   13h
service/velero-hl        ClusterIP      None          <none>        9000/TCP                        13h

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/backup-driver    1/1     1            1           2m3s
deployment.apps/velero           1/1     1            1           2m9s
deployment.apps/velero-console   1/1     1            1           13h

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/backup-driver-6765cf5cb7    1         1         1       2m3s
replicaset.apps/velero-7bd5bcf869           1         1         1       2m9s
replicaset.apps/velero-console-779694f649   1         1         1       13h

NAME                             READY   AGE
statefulset.apps/velero-zone-0   4/4     13h
This looks good. Note again that there is a difference between the kubectl output shown here and the vSphere UI. There are 2 new Pods shown above (velero and backup-driver), but only one appears in the vSphere UI. The backup-driver Pod is deployed to the control plane nodes, so it will not show up in the UI. The velero Pod runs on the worker nodes, so it will appear as a Pod VM in the UI.
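One quick way to confirm this placement for yourself is to list the Pods in the velero-ns namespace with the wide output option, which adds a NODE column showing where each Pod is running:

# Confirm which nodes the velero and backup-driver Pods landed on
kubectl -n velero-ns get pods -o wide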
Once this step has completed, you can also check the status of the client and server / operator.
$ velero version
Client:
        Version: v1.5.3
        Git commit: 123109a3bcac11dbb6783d2758207bac0d0817cb
Server:
        Version: v1.5.1
The status of the Velero service itself is also worth checking at this point.
$ kubectl -n velero-ns get veleroservice default -o json | jq '.status'
{
  "enabled": true,
  "installphase": "Completed",
  "version": "v1.5.1"
}
Take a simple backup using the Velero client
The last step in the sequence is to ensure we can take a backup. Let’s do that next. First of all though, check the namespace configuration of your Velero client once more, and ensure it is pointing at the namespace where Velero was installed.
$ velero client config get
namespace: velero-ns
Next, I am going to change context to my application namespace, called cormac-ns. Here I have a Cassandra NoSQL DB stateful set, and also an Nginx deployment running a web server (no persistent data to back up). Note that these applications are deployed as Pod VMs running on my vSphere with Tanzu Supervisor cluster.
$ kubectl config get-contexts
CURRENT   NAME                       CLUSTER    AUTHINFO                                   NAMESPACE
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          20.0.0.1                   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   cormac-ns
          minio-domain-c8            20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   minio-domain-c8
*         velero-ns                  20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-ns
          velero-vsphere-domain-c8   20.0.0.1   wcp:20.0.0.1:administrator@vsphere.local   velero-vsphere-domain-c8

$ kubectl config use-context cormac-ns
Switched to context "cormac-ns".

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
cassandra-0                        1/1     Running   0          10d
cassandra-1                        1/1     Running   0          10d
cassandra-2                        1/1     Running   0          10d
nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10d
nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10d
nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10d
I am now going to take a backup of my Nginx deployment. Before that, let’s take a look at my S3 object store. So far, it is empty. Let’s see if we can get some backup information stored to it.
I am going to use the Pod labels as a way of selecting what I want Velero to backup. First, list the labels on the Pods. Note the Nginx Pods have app=nginx labels, so I will use that.
$ kubectl get pods --show-labels
NAME                               READY   STATUS    RESTARTS   AGE   LABELS
cassandra-0                        1/1     Running   0          10d   app=cassandra,controller-revision-hash=cassandra-54d8d8874f,statefulset.kubernetes.io/pod-name=cassandra-0
cassandra-1                        1/1     Running   0          10d   app=cassandra,controller-revision-hash=cassandra-54d8d8874f,statefulset.kubernetes.io/pod-name=cassandra-1
cassandra-2                        1/1     Running   0          10d   app=cassandra,controller-revision-hash=cassandra-54d8d8874f,statefulset.kubernetes.io/pod-name=cassandra-2
nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10d   app=nginx,pod-template-hash=b4d6b7cf8
nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10d   app=nginx,pod-template-hash=b4d6b7cf8
nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10d   app=nginx,pod-template-hash=b4d6b7cf8
Now we are ready to take the backup.
$ velero backup get

$ velero backup create nginx-backup --selector app=nginx
Backup request "nginx-backup" submitted successfully.
Run `velero backup describe nginx-backup` or `velero backup logs nginx-backup` for more details.

$ velero backup get
NAME           STATUS       ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   InProgress   0        0          2021-02-23 17:53:31 +0000 GMT   30d       default            app=nginx

$ velero backup describe nginx-backup
Name:         nginx-backup
Namespace:    velero-ns
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.2-6+38ac483e736488
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18+

Phase:  InProgress

Errors:    0
Warnings:  0

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=nginx

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2021-02-23 17:53:31 +0000 GMT
Completed:  <n/a>

Expiration:  2021-03-25 17:53:31 +0000 GMT

Velero-Native Snapshots: <none included>
And hopefully after a moment or two, since there isn’t a lot of data to transfer in this backup, we should see it complete.
$ velero backup get
NAME           STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   0        0          2021-02-23 17:53:31 +0000 GMT   30d       default            app=nginx
The logs from the backup can be displayed using the command velero backup logs nginx-backup, but we are more interested in seeing if anything got transferred to the S3 Object Store.
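If you prefer the command line to the Minio browser, the same check can be done with the mc client, assuming the alias created in the earlier sketch; Velero writes its backup metadata under a backups/ prefix in the bucket.

# List what the backup pushed into the bucket (alias from the earlier mc sketch)
mc ls --recursive velero-minio/backup-bucket/backups/nginx-backup/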
Looks good to me. If you want to see more details about the backup, you can run the following command and it will show the resources that were captured in the backup:
$ velero backup describe nginx-backup --details
Name:         nginx-backup
Namespace:    velero-ns
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.2-6+38ac483e736488
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18+

Phase:  Completed

Errors:    0
Warnings:  0

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=nginx

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2021-02-23 17:53:31 +0000 GMT
Completed:  2021-02-23 17:53:46 +0000 GMT

Expiration:  2021-03-25 17:53:31 +0000 GMT

Total items to be backed up:  7
Items backed up:              7

Resource List:
  apps/v1/Deployment:
    - cormac-ns/nginx-deployment
  apps/v1/ReplicaSet:
    - cormac-ns/nginx-deployment-b4d6b7cf8
  v1/Endpoints:
    - cormac-ns/nginx-svc
  v1/Pod:
    - cormac-ns/nginx-deployment-b4d6b7cf8-cfftz
    - cormac-ns/nginx-deployment-b4d6b7cf8-q29lr
    - cormac-ns/nginx-deployment-b4d6b7cf8-zmfhd
  v1/Service:
    - cormac-ns/nginx-svc

Velero-Native Snapshots: <none included>
Restore
There is not much point in having a backup if I cannot restore it. I am now going to remove the Nginx deployment and service from the cormac-ns namespace, and restore them using Velero.
$ kubectl config use-context cormac-ns
Switched to context "cormac-ns".

$ kubectl get all
NAME                                   READY   STATUS    RESTARTS   AGE
pod/cassandra-0                        1/1     Running   0          10d
pod/cassandra-1                        1/1     Running   0          10d
pod/cassandra-2                        1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10d

NAME                                                    TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE
service/cassandra                                       ClusterIP      10.96.3.50    <none>        9042/TCP                     27d
service/nginx-svc                                       LoadBalancer   10.96.1.252   20.0.0.4      443:32662/TCP,80:31832/TCP   27d
service/tkg-cluster-vcf-w-tanzu-control-plane-service   LoadBalancer   10.96.0.31    20.0.0.3      6443:31677/TCP               71d

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   3/3     3            3           27d

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-deployment-b4d6b7cf8   3         3         3       27d

NAME                         READY   AGE
statefulset.apps/cassandra   3/3     27d

$ kubectl delete deployment.apps/nginx-deployment
deployment.apps "nginx-deployment" deleted

$ kubectl delete service/nginx-svc
service "nginx-svc" deleted

$ kubectl get all
NAME              READY   STATUS    RESTARTS   AGE
pod/cassandra-0   1/1     Running   0          10d
pod/cassandra-1   1/1     Running   0          10d
pod/cassandra-2   1/1     Running   0          10d

NAME                                                    TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
service/cassandra                                       ClusterIP      10.96.3.50   <none>        9042/TCP         27d
service/tkg-cluster-vcf-w-tanzu-control-plane-service   LoadBalancer   10.96.0.31   20.0.0.3      6443:31677/TCP   71d

NAME                         READY   AGE
statefulset.apps/cassandra   3/3     27d

$ velero backup get
NAME           STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   0        0          2021-02-23 17:53:31 +0000 GMT   29d       default            app=nginx

$ velero restore create restore-nginx --from-backup nginx-backup
Restore request "restore-nginx" submitted successfully.
Run `velero restore describe restore-nginx` or `velero restore logs restore-nginx` for more details.

$ velero restore describe restore-nginx
Name:         restore-nginx
Namespace:    velero-ns
Labels:       <none>
Annotations:  <none>

Phase:  Completed

Started:    2021-02-24 09:25:59 +0000 GMT
Completed:  2021-02-24 09:25:59 +0000 GMT

Backup:  nginx-backup

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto

$ kubectl get all
NAME                                   READY   STATUS    RESTARTS   AGE
pod/cassandra-0                        1/1     Running   0          10d
pod/cassandra-1                        1/1     Running   0          10d
pod/cassandra-2                        1/1     Running   0          10d
pod/nginx-deployment-b4d6b7cf8-cfftz   1/1     Running   0          10s
pod/nginx-deployment-b4d6b7cf8-q29lr   1/1     Running   0          10s
pod/nginx-deployment-b4d6b7cf8-zmfhd   1/1     Running   0          10s

NAME                                                    TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
service/cassandra                                       ClusterIP      10.96.3.50   <none>        9042/TCP                     27d
service/nginx-svc                                       LoadBalancer   10.96.3.65   20.0.0.7      443:32575/TCP,80:30197/TCP   10s
service/tkg-cluster-vcf-w-tanzu-control-plane-service   LoadBalancer   10.96.0.31   20.0.0.3      6443:31677/TCP               71d

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   3/3     3            3           10s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-deployment-b4d6b7cf8   3         3         3       10s

NAME                         READY   AGE
statefulset.apps/cassandra   3/3     27d
The restore appears to have been successful. The deployment, replicaset, service and pods have all been restored, although the nginx service has come back on a new Load Balancer IP address (which is to be expected). The other thing to note is that no nginx images were backed up or restored in this example. Velero backed up and restored the Kubernetes manifests only; the images were pulled again from the registry referenced in those manifests.
So this is a very simple backup and restore just to verify the functionality. I need to spend some more cycles on this setup and try some additional steps, such as restoring to another namespace, and backing up and restoring applications that use Persistent Volumes, as well as TKG “guest” clusters of course. But that is enough for the moment.
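As a pointer for the restore-to-another-namespace test, the Velero client already supports this via namespace mappings. A minimal sketch follows; the target namespace name is made up, and in vSphere with Tanzu it would presumably need to exist as a vSphere namespace first.

# Restore the nginx backup into a different namespace (target name is just an example)
velero restore create nginx-restore-copy --from-backup nginx-backup \
    --namespace-mappings cormac-ns:cormac-ns-copy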
Conclusion
Well, that looks pretty nice to me. We have the Minio S3 Object Store (provisioned from the vSAN Data Persistence platform – DPp) providing us with a destination that can be used for storing our backups of Supervisor cluster objects taken via the Velero vSphere Operator. These services are running hand-in-hand, side-by-side in vSphere with Tanzu (on VCF). Things are starting to come together nicely.
Hi Cormac,
thanks for this very helpful blog article!
I was able to install MinIO and Velero, and I successfully created a backup of my WordPress demo application.
However, when trying to restore, the process gets stuck during creation of the PVCs.
Are there any known issues with persistent volume claims?
After the failed restore attempt, I am not able to remove the namespace where the failed restore occurred.
It seems, something is blocking the deletion. Any ideas?
Best regards,
Volker
Hi Volker – have you set the Velero Data Manager? This is needed in order to snapshot and move data to the S3 object store. See my most recent post.
Hi Cormac,
I didn’t realize, that you already wrote a follow-up article for stateful pods with PVCs.
Now I was able to deploy and configure the Velero Data Manager, but restore still isn’t working 🙁
kurthv@VDI-VK:~/k8s-examples/nginx-app$ velero restore get
NAME BACKUP STATUS STARTED COMPLETED ERRORS WARNINGS CREATED SELECTOR
restore-nginx nginx-backup New 0 0 2021-03-04 15:52:25 +0100 CET
kurthv@VDI-VK:~/k8s-examples/nginx-app$ velero describe restore restore-nginx
Name: restore-nginx
Namespace: velero-ns
Labels:       <none>
Annotations:  <none>
Phase:
Started:
Completed:
Backup:  nginx-backup
Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>
Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto
Namespace mappings:  <none>
Label selector:  <none>
Restore PVs:  auto
Any ideas?
BTW:
– For testing purposes I used the nginx-app example “with-pv.yaml” which comes with the velero binaries.
– I also tried to backup & restore the wordpress demo application from https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/
When I perform a backup of the NGINX application, I can see data uploaded to the plugins directory in the MinIO bucket.
When I perform a backup of the wordpress application, I don’t see data in the plugins directory – only in the backups directory.
Here you can find the logs:
https://www.dropbox.com/s/bpq9wapy6tx2xog/velero-logs.tgz?dl=0
Hey Volker, I would look at past issues, and perhaps even raise a new issue here – https://github.com/vmware-tanzu/velero-plugin-for-vsphere/issues. This is probably the best way to get assistance.
Okay, I will first test with your Cassandra stateful app, and maybe then raise a new issue on GitHub.