Setting up Velero 1.0.0 to backup K8s on vSphere/PKS
I have written about Velero a few times on this blog, but I haven’t actually looked at how you would deploy the 1.0.0 version, even though it has been available since May 2019. Someone recently reached out to me for some guidance on how to deploy it, as there are a few subtle differences from previous versions. Therefore I decided to document, step by step, how to do it, focusing on the case where your Kubernetes cluster is running on vSphere. I also highlight a gotcha when using Velero to back up applications running on Kubernetes deployed via Enterprise PKS (Pivotal Container Service).
To recap, these are the steps that I will cover in detail:
- Download and extract Velero 1.0.0
- Download any required images to a local repo if the K8s nodes cannot access the internet
- Deploy and Configure local Minio S3 Object Store
- Ensure that the PKS tile in Pivotal Ops Manager has the ‘allow privileged containers’ checkbox selected
- Install Velero via velero install; the command should include restic support and the Minio publicUrl
- Modify hostPath setting in restic DaemonSet for Enterprise PKS
- [New] Create a ConfigMap for the velero-restic-restore-helper
- Run a test Velero backup/restore
Let’s look at each of these steps now.
1. Download and extract Velero 1.0.0
The release can be downloaded from https://github.com/heptio/velero/releases/tag/v1.0.0. Download and extract it, then copy or move the velero binary to somewhere in your PATH.
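For example, on a Linux host the download and extract steps might look something like this (the tarball name assumes the linux-amd64 build from that release page; adjust for your platform):

$ wget https://github.com/heptio/velero/releases/download/v1.0.0/velero-v1.0.0-linux-amd64.tar.gz
$ tar -zxvf velero-v1.0.0-linux-amd64.tar.gz
$ sudo cp velero-v1.0.0-linux-amd64/velero /usr/local/bin/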
2. Pull any required images and push them to local repo (e.g. Harbor)
As mentioned in the introduction, this step is only necessary if your Kubernetes nodes do not have access to the internet. This is the case in my lab, so I do a docker pull, docker tag and docker push to my Harbor repo. For Velero, there are 3 images that need to be handled. Two of these are Minio images, which also require a modification to the 00-minio-deployment manifest. Below are the before and after versions of the manifest file.
$ grep image examples/minio/00-minio-deployment.yaml
        image: minio/minio:latest
        imagePullPolicy: IfNotPresent
        image: minio/mc:latest
        imagePullPolicy: IfNotPresent

$ grep image examples/minio/00-minio-deployment.yaml
        image: harbor.rainpole.com/library/minio:latest
        imagePullPolicy: IfNotPresent
        image: harbor.rainpole.com/library/mc:latest
        imagePullPolicy: IfNotPresent
The third image is referenced during the install. By default, the image used for the Velero and restic server pods is pulled from gcr.io/heptio-images/velero:v1.0.0. We also need to pull this image and push it to Harbor, and then add an --image argument to velero install to point at the image in the local Harbor repo, which you will see shortly.
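As a rough sketch, the pull/tag/push for all three images might look like this (harbor.rainpole.com/library is my local Harbor project; substitute your own):

# Minio server and client images referenced by 00-minio-deployment.yaml
$ docker pull minio/minio:latest
$ docker tag minio/minio:latest harbor.rainpole.com/library/minio:latest
$ docker push harbor.rainpole.com/library/minio:latest
$ docker pull minio/mc:latest
$ docker tag minio/mc:latest harbor.rainpole.com/library/mc:latest
$ docker push harbor.rainpole.com/library/mc:latest
# Velero server image, referenced later via the --image argument to velero install
$ docker pull gcr.io/heptio-images/velero:v1.0.0
$ docker tag gcr.io/heptio-images/velero:v1.0.0 harbor.rainpole.com/library/velero:v1.0.0
$ docker push harbor.rainpole.com/library/velero:v1.0.0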
3. Deploy and Configure local Minio Object Store
There are a few different steps required here. We already modified the deployment YAML in step 2, but only because our images are in a local repo and the K8s nodes have no access to the internet. If your nodes do have internet access, no modification is needed.
3.1 Create a Minio credentials file
A simple credentials file containing the login/password (id/key) for the local on-premises Minio S3 Object Store must be created.
$ cat credentials-velero
[default]
aws_access_key_id = minio
aws_secret_access_key = minio123
3.2 Expose Minio Service on a NodePort
This step is a good idea for 2 reasons. The first is that it gives you a way to access the Minio portal and examine the contents of any backups. The second is that it enables you to specify a publicUrl for Minio, which in turn means that you can access backup and restore logs from the Minio S3 Object Store. To do this, it requires a modification to the 00-minio-deployment manifest:
spec:
  # ClusterIP is recommended for production environments.
  # Change to NodePort if needed per documentation,
  # but only if you run Minio in a test/trial environment, for example with Minikube.
  type: NodePort
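Alternatively, if Minio has already been deployed with the default Service type, a patch along these lines should switch the Service over to NodePort without editing the manifest (a quick sketch, not something I used in the walkthrough below):

$ kubectl -n velero patch svc minio -p '{"spec": {"type": "NodePort"}}'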
3.3 Deploy Minio
$ kubectl apply -f examples/minio/00-minio-deployment.yaml
3.4 Verify Minio is available on the public URL
If we now go ahead and retrieve the node on which the Minio server is running, as well as the port that it has been exposed on with the changes made in step 3.2, we should be able to verify that Minio is working.
$ kubectl get pods -n velero
NAME                     READY   STATUS      RESTARTS   AGE
minio-66dc75bb8d-95xpp   1/1     Running     0          25s
minio-setup-zpnfl        0/1     Completed   0          25s

$ kubectl describe pod minio-66dc75bb8d-bczf8 -n velero | grep -i Node:
Node:               140ab5aa-0159-4612-b68c-df39dbea2245/192.168.192.5

$ kubectl get svc -n velero
NAME    TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
minio   NodePort   10.100.200.82   <none>        9000:32109/TCP   5s
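As a quick check from outside the cluster, a curl against the node IP and NodePort shown above should get a response back from Minio (the exact HTTP status is not important here, only that the endpoint answers):

$ curl -I http://192.168.192.5:32109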

OK. Everything is now in place to allow us to do our velero install.
4. Enable Privileged Containers
To successfully create the restic pods when deploying Velero, you need to enable the ‘Allow Privileged’ checkbox (called ‘Enable Privileged Containers’ in earlier versions of PKS). The ‘DenyEscalatingExec’ checkbox should also be selected on the PKS plan in Pivotal Ops Manager. You will then need to re-apply the PKS configuration after selecting the checkboxes. For further details on how this setting appeared in previous versions of PKS, and the behaviour when it was not enabled, have a look at part 3 of this earlier blog on installing Velero v0.11 on PKS. In current versions of PKS, it should look something like this on the PKS plan in Pivotal Ops Manager.
5. Install Velero
A big difference in Velero 1.0 is the new velero install command. No more messing around with the multiple manifest files that we had in previous versions. There are a few things to include in the velero install command. Since there is no vSphere plugin at this time, we rely on restic, a third-party open source backup tool integrated with Velero, to back up persistent volumes, so the command line must include an option to enable restic support. As mentioned, we have set up a publicUrl for Minio, so we should also include this on the command line. Finally, because my K8s nodes do not have access to the internet, and thus cannot pull down external images, I have a local Harbor repo to which I have already pushed the velero image (normally pulled from gcr.io/heptio-images/velero:v1.0.0), and I need to reference it in the install command. With all those modifications, this is what my install command looks like:
$ velero install --provider aws --bucket velero \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--image harbor.rainpole.com/library/velero:v1.0.0 \
--use-restic \
--backup-location-config \
region=minio,s3ForcePathStyle="true",s3Url=http://minio.velero.svc:9000,publicUrl=http://192.168.192.5:32109
After running the command, the following output is displayed:
CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource
CustomResourceDefinition/backupstoragelocations.velero.io: created
CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource
CustomResourceDefinition/serverstatusrequests.velero.io: created
CustomResourceDefinition/restores.velero.io: attempting to create resource
CustomResourceDefinition/restores.velero.io: created
CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource
CustomResourceDefinition/podvolumebackups.velero.io: created
CustomResourceDefinition/resticrepositories.velero.io: attempting to create resource
CustomResourceDefinition/resticrepositories.velero.io: created
CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource
CustomResourceDefinition/deletebackuprequests.velero.io: created
CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource
CustomResourceDefinition/podvolumerestores.velero.io: created
CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource
CustomResourceDefinition/volumesnapshotlocations.velero.io: created
CustomResourceDefinition/backups.velero.io: attempting to create resource
CustomResourceDefinition/backups.velero.io: created
CustomResourceDefinition/schedules.velero.io: attempting to create resource
CustomResourceDefinition/schedules.velero.io: created
CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource
CustomResourceDefinition/downloadrequests.velero.io: created
Waiting for resources to be ready in cluster...
Namespace/velero: attempting to create resource
Namespace/velero: already exists, proceeding
Namespace/velero: created
ClusterRoleBinding/velero: attempting to create resource
ClusterRoleBinding/velero: created
ServiceAccount/velero: attempting to create resource
ServiceAccount/velero: created
Secret/cloud-credentials: attempting to create resource
Secret/cloud-credentials: created
BackupStorageLocation/default: attempting to create resource
BackupStorageLocation/default: created
Deployment/velero: attempting to create resource
Deployment/velero: created
DaemonSet/restic: attempting to create resource
DaemonSet/restic: created
Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status.
LGTM. I also like the little sailboat in the output (Velero is Spanish for sailboat I believe). Let’s take a look at the logs and make sure everything deployed successfully.
time="2019-08-07T15:02:46Z" level=info msg="setting log-level to INFO" time="2019-08-07T15:02:46Z" level=info msg="Starting Velero server v1.0.0 (72f5cadc3a865019ab9dc043d4952c9bfd5f2ecb)" logSource="pkg/cmd/server/server.go:165" time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=BackupItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/pod time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=BackupItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/pv time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=BackupItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/serviceaccount time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=VolumeSnapshotter logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/aws time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=VolumeSnapshotter logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/azure time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=VolumeSnapshotter logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/gcp time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=ObjectStore logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/aws time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=ObjectStore logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/azure time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=ObjectStore logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/gcp time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=RestoreItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/addPVCFromPod time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=RestoreItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/addPVFromPVC time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=RestoreItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/job time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=RestoreItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/pod time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=RestoreItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/restic time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=RestoreItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/service time="2019-08-07T15:02:46Z" level=info msg="registering plugin" command=/velero kind=RestoreItemAction logSource="pkg/plugin/clientmgmt/registry.go:100" name=velero.io/serviceaccount time="2019-08-07T15:02:46Z" level=info msg="Checking existence of namespace" logSource="pkg/cmd/server/server.go:355" namespace=velero time="2019-08-07T15:02:46Z" level=info msg="Namespace exists" logSource="pkg/cmd/server/server.go:361" namespace=velero time="2019-08-07T15:02:48Z" level=info msg="Checking existence of Velero custom resource definitions" logSource="pkg/cmd/server/server.go:390" time="2019-08-07T15:02:48Z" level=info msg="All Velero custom resource definitions exist" logSource="pkg/cmd/server/server.go:424" 
time="2019-08-07T15:02:48Z" level=info msg="Checking that all backup storage locations are valid" logSource="pkg/cmd/server/server.go:431" time="2019-08-07T15:02:48Z" level=info msg="Starting controllers" logSource="pkg/cmd/server/server.go:535" time="2019-08-07T15:02:48Z" level=info msg="Starting metric server at address [:8085]" logSource="pkg/cmd/server/server.go:543" time="2019-08-07T15:02:48Z" level=info msg="Server started successfully" logSource="pkg/cmd/server/server.go:788" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=gc-controller logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=gc-controller logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=backup-deletion logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=backup-deletion logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=downloadrequest logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=downloadrequest logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=serverstatusrequest logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=serverstatusrequest logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=backup-sync logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=backup-sync logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=schedule logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=schedule logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=restore logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=restore logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=restic-repository logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=restic-repository logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Starting controller" controller=backup logSource="pkg/controller/generic_controller.go:76" time="2019-08-07T15:02:48Z" level=info msg="Waiting for caches to sync" controller=backup logSource="pkg/controller/generic_controller.go:79" time="2019-08-07T15:02:48Z" level=info msg="Caches are synced" controller=schedule logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Caches are synced" controller=serverstatusrequest logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Caches are synced" controller=gc-controller logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info 
msg="Caches are synced" controller=downloadrequest logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Caches are synced" controller=backup-sync logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Caches are synced" controller=backup logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Caches are synced" controller=restore logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Caches are synced" controller=restic-repository logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Syncing contents of backup store into cluster" backupLocation=default controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:170" time="2019-08-07T15:02:49Z" level=info msg="Got backups from backup store" backupCount=0 backupLocation=default controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:178" time="2019-08-07T15:02:49Z" level=info msg="Caches are synced" controller=backup-deletion logSource="pkg/controller/generic_controller.go:83" time="2019-08-07T15:02:49Z" level=info msg="Checking for expired DeleteBackupRequests" controller=backup-deletion logSource="pkg/controller/backup_deletion_controller.go:441" time="2019-08-07T15:02:49Z" level=info msg="Done checking for expired DeleteBackupRequests" controller=backup-deletion logSource="pkg/controller/backup_deletion_controller.go:469" time="2019-08-07T15:03:49Z" level=info msg="Syncing contents of backup store into cluster" backupLocation=default controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:170" time="2019-08-07T15:03:49Z" level=info msg="Got backups from backup store" backupCount=0 backupLocation=default controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:178"
Again, this LGTM. There are no errors in the logs. Looks like we are almost ready to take a backup.
6. Modify hostPath in restic DaemonSet for Enterprise PKS
This step is only necessary for Enterprise PKS (Pivotal Container Service) deployments. This is because the path to the Pods on the Nodes in a PKS deployment is different from what we have in native Kubernetes deployments. If you have deployed this on PKS and you query the status of the Pods in the velero namespace, you will notice that the restic Pods have a RunContainerError/CrashLoopBackOff error. Typically the path to Pods on native K8s is /var/lib/kubelet/pods, but on PKS they are located at /var/vcap/data/kubelet/pods. So this step points restic at the correct location of the Pods for backup purposes when K8s is deployed by PKS. First, identify the restic DaemonSet.
$ kubectl get ds --all-namespaces
NAMESPACE     NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-system   vrops-cadvisor   3         3         3       3            3           <none>          5d3h
pks-system    fluent-bit       3         3         3       3            3           <none>          5d3h
pks-system    telegraf         3         3         3       3            3           <none>          5d3h
velero        restic           3         3         0       3            0           <none>          2m21s
Next, edit the DaemonSet and change the hostPath. The before and after edits are shown below.
$ kubectl edit ds restic -n velero

(before)
      volumes:
      - hostPath:
          path: /var/lib/kubelet/pods
          type: ""
        name: host-pods

(after)
      volumes:
      - hostPath:
          path: /var/vcap/data/kubelet/pods
          type: ""
        name: host-pods

daemonset.extensions/restic edited
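If you prefer a non-interactive change, a JSON patch along these lines should make the same modification (a sketch only; it assumes host-pods is the first entry in the volumes list, so verify the index in your DaemonSet before running it):

$ kubectl -n velero patch ds restic --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/volumes/0/hostPath/path", "value": "/var/vcap/data/kubelet/pods"}]'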
This will terminate and restart the restic Pods. At this point, the velero and restic Pods should all be running. Now we are ready to do a test backup/restore.
7. Create a ConfigMap for the velero-restic-restore-helper
This is a step that I missed in the first version of this post. During a restore of Pods with Persistent Volumes that were backed up with restic, a temporary pod is instantiated to assist with the restore. By default, this image is pulled from gcr.io/heptio-images/velero-restic-restore-helper:v1.0.0. Since my nodes do not have access to the internet, I need to tell Velero to get this image from my local repo. This is achieved by creating a ConfigMap with the image location, as per the Customize Restore Helper Image instructions found here. After the usual docker pull/tag/push to get the image into my local Harbor repo, I created and applied the following ConfigMap, with the image location at the end:
$ cat restic-config-map.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # any name can be used; Velero uses the labels (below)
  # to identify it rather than the name
  name: restic-restore-action-config
  # must be in the velero namespace
  namespace: velero
  # the below labels should be used verbatim in your
  # ConfigMap.
  labels:
    # this value-less label identifies the ConfigMap as
    # config for a plugin (i.e. the built-in restic restore
    # item action plugin)
    velero.io/plugin-config: ""
    # this label identifies the name and kind of plugin
    # that this ConfigMap is for.
    velero.io/restic: RestoreItemAction
data:
  # "image" is the only configurable key. The value can either
  # include a tag or not; if the tag is *not* included, the
  # tag from the main Velero image will automatically be used.
  image: harbor.rainpole.com/library/velero-restic-restore-helper:v1.0.0

$ kubectl apply -f restic-config-map.yaml
configmap/restic-restore-action-config created
This means that for any restore involving restic volumes, the helper image can now be pulled successfully. You can now go ahead and check the Velero client and server versions using the velero version command.
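A quick sanity check before moving on; velero version reports both the client and server versions, and the pod listing should show the velero and restic Pods all Running:

$ velero version
$ kubectl get pods -n velero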
8. Run a test Velero backup/restore
Velero provides a sample nginx application for backup testing. However, this once again relies on pulling an nginx image from the internet. If, like me, you are using a local repo, then you will have to do another pull, tag and push, and update the sample manifest so the nginx app gets its image from the local repo, e.g.
$ grep image examples/nginx-app/base.yaml
      - image: harbor.rainpole.com/library/nginx:1.15-alpine
Again, this is only necessary if your nodes do not have internet access. With that modification in place, you can go ahead and deploy the sample nginx app so we can try to back it up and restore it with Velero.
8.1 Deploy sample nginx app
$ kubectl apply -f examples/nginx-app/base.yaml
namespace/nginx-example created
deployment.apps/nginx-deployment created
service/my-nginx created

$ kubectl get pods --all-namespaces
NAMESPACE             NAME                                     READY   STATUS      RESTARTS   AGE
cassandra             cassandra-0                              1/1     Running     0          23h
cassandra             cassandra-1                              1/1     Running     0          23h
cassandra             cassandra-2                              1/1     Running     0          23h
default               wavefront-proxy-79568456c6-z82rh         1/1     Running     0          24h
kube-system           coredns-54586579f6-f7knj                 1/1     Running     0          5d3h
kube-system           coredns-54586579f6-t5r5h                 1/1     Running     0          5d3h
kube-system           coredns-54586579f6-v2cjt                 1/1     Running     0          5d3h
kube-system           kube-state-metrics-86977fd78d-6tb5m      2/2     Running     0          24h
kube-system           kubernetes-dashboard-6c68548bc9-km8dd    1/1     Running     0          5d3h
kube-system           metrics-server-5475446b7f-m2fgx          1/1     Running     0          5d3h
kube-system           vrops-cadvisor-488p8                     1/1     Running     0          5d3h
kube-system           vrops-cadvisor-cdx5w                     1/1     Running     0          5d3h
kube-system           vrops-cadvisor-wgkkl                     1/1     Running     0          5d3h
nginx-example         nginx-deployment-5f8798768c-5jdkn        1/1     Running     0          8s
nginx-example         nginx-deployment-5f8798768c-lrsw6        1/1     Running     0          8s
pks-system            cert-generator-v0.19.4-qh6kg             0/1     Completed   0          5d3h
pks-system            event-controller-5dbd8f48cc-vwpc4        2/2     Running     546        5d3h
pks-system            fluent-bit-7cx69                         3/3     Running     0          5d3h
pks-system            fluent-bit-fpbl6                         3/3     Running     0          5d3h
pks-system            fluent-bit-j674j                         3/3     Running     0          5d3h
pks-system            metric-controller-5bf6cb67c6-bbh6q       1/1     Running     0          5d3h
pks-system            observability-manager-5578bbb84f-w87bj   1/1     Running     0          5d3h
pks-system            sink-controller-54947f5bd9-42spw         1/1     Running     0          5d3h
pks-system            telegraf-4gv8b                           1/1     Running     0          5d3h
pks-system            telegraf-dtcjc                           1/1     Running     0          5d3h
pks-system            telegraf-m2pjd                           1/1     Running     0          5d3h
pks-system            telemetry-agent-776d45f8d8-c2xhg         1/1     Running     0          5d3h
pks-system            validator-76fff49f5d-m5t4h               1/1     Running     0          5d3h
velero                minio-66dc75bb8d-95xpp                   1/1     Running     0          11m
velero                minio-setup-zpnfl                        0/1     Completed   0          11m
velero                restic-7mztz                             1/1     Running     0          3m28s
velero                restic-cxfpt                             1/1     Running     0          3m28s
velero                restic-qx98s                             1/1     Running     0          3m28s
velero                velero-7d97d7ff65-drl5c                  1/1     Running     0          9m35s
wavefront-collector   wavefront-collector-76f7c9fb86-d9pw8     1/1     Running     0          24h

$ kubectl get ns
NAME                  STATUS   AGE
cassandra             Active   23h
default               Active   5d3h
kube-public           Active   5d3h
kube-system           Active   5d3h
nginx-example         Active   4s
pks-system            Active   5d3h
velero                Active   9m40s
wavefront-collector   Active   24h

$ kubectl get deployments --namespace=nginx-example
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   2/2     2            2           20s

$ kubectl get svc --namespace=nginx-example
NAME       TYPE           CLUSTER-IP       EXTERNAL-IP                 PORT(S)        AGE
my-nginx   LoadBalancer   10.100.200.147   100.64.0.1,192.168.191.70   80:30942/TCP   32s
This nginx deployment assumes the presence of a LoadBalancer for its Service. Fortunately, I have NSX-T deployed, which provides IP addresses for LoadBalancer services. In the output above, the external IP allocated to the nginx service is 192.168.191.70. If I point a browser at that IP address, I get an nginx landing page.
8.2 First backup
$ velero backup create nginx-backup --selector app=nginx
Backup request "nginx-backup" submitted successfully.
Run `velero backup describe nginx-backup` or `velero backup logs nginx-backup` for more details.

$ velero backup get
NAME           STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   2019-08-07 16:13:44 +0100 IST   29d       default            app=nginx
This backup should also be visible in Minio:
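The backup can also be examined from the CLI; describing it with the --details flag lists the individual resources that were captured:

$ velero backup describe nginx-backup --details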
8.3 Destroy nginx deployment
Let’s now go ahead and remove the nginx namespace, and then do a restore of our backup. Hopefully our web server will come back afterwards.
$ kubectl get ns
NAME                  STATUS   AGE
cassandra             Active   40h
default               Active   5d20h
kube-public           Active   5d20h
kube-system           Active   5d20h
nginx-example         Active   17h
pks-system            Active   5d20h
velero                Active   17h
wavefront-collector   Active   41h

$ kubectl delete ns nginx-example
namespace "nginx-example" deleted

$ kubectl get ns
NAME                  STATUS   AGE
cassandra             Active   40h
default               Active   5d20h
kube-public           Active   5d20h
kube-system           Active   5d20h
pks-system            Active   5d20h
velero                Active   17h
wavefront-collector   Active   41h

$ kubectl get svc --all-namespaces
NAMESPACE     NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
cassandra     cassandra              ClusterIP   None             <none>        9042/TCP            40h
default       kubernetes             ClusterIP   10.100.200.1     <none>        443/TCP             5d20h
default       wavefront-proxy        ClusterIP   10.100.200.56    <none>        2878/TCP            46h
kube-system   kube-dns               ClusterIP   10.100.200.2     <none>        53/UDP,53/TCP       5d20h
kube-system   kube-state-metrics     ClusterIP   10.100.200.187   <none>        8080/TCP,8081/TCP   41h
kube-system   kubernetes-dashboard   NodePort    10.100.200.160   <none>        443:32485/TCP       5d20h
kube-system   metrics-server         ClusterIP   10.100.200.52    <none>        443/TCP             5d20h
pks-system    fluent-bit             ClusterIP   10.100.200.175   <none>        24224/TCP           5d20h
pks-system    validator              ClusterIP   10.100.200.149   <none>        443/TCP             5d20h
velero        minio                  NodePort    10.100.200.82    <none>        9000:32109/TCP      17h
8.4 First restore
Let’s try to restore our backup.
$ velero backup get
NAME           STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   2019-08-07 16:13:44 +0100 IST   29d       default            app=nginx

$ velero restore create nginx-restore --from-backup nginx-backup
Restore request "nginx-restore" submitted successfully.
Run `velero restore describe nginx-restore` or `velero restore logs nginx-restore` for more details.

$ velero restore describe nginx-restore
Name:         nginx-restore
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:  Completed

Backup:  nginx-backup

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto
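You can also list the restores to confirm that the phase has reached Completed:

$ velero restore get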
8.5 Verify restore succeeded
Now we need to see if the namespace, deployment and service have been restored.
$ kubectl get ns
NAME                  STATUS   AGE
cassandra             Active   40h
default               Active   5d20h
kube-public           Active   5d20h
kube-system           Active   5d20h
nginx-example         Active   17s
pks-system            Active   5d20h
velero                Active   17h
wavefront-collector   Active   41h

$ kubectl get svc --all-namespaces
NAMESPACE       NAME                   TYPE           CLUSTER-IP       EXTERNAL-IP                 PORT(S)             AGE
cassandra       cassandra              ClusterIP      None             <none>                      9042/TCP            40h
default         kubernetes             ClusterIP      10.100.200.1     <none>                      443/TCP             5d20h
default         wavefront-proxy        ClusterIP      10.100.200.56    <none>                      2878/TCP            46h
kube-system     kube-dns               ClusterIP      10.100.200.2     <none>                      53/UDP,53/TCP       5d20h
kube-system     kube-state-metrics     ClusterIP      10.100.200.187   <none>                      8080/TCP,8081/TCP   41h
kube-system     kubernetes-dashboard   NodePort       10.100.200.160   <none>                      443:32485/TCP       5d20h
kube-system     metrics-server         ClusterIP      10.100.200.52    <none>                      443/TCP             5d20h
nginx-example   my-nginx               LoadBalancer   10.100.200.225   100.64.0.1,192.168.191.67   80:32350/TCP        23s
pks-system      fluent-bit             ClusterIP      10.100.200.175   <none>                      24224/TCP           5d20h
pks-system      validator              ClusterIP      10.100.200.149   <none>                      443/TCP             5d20h
velero          minio                  NodePort       10.100.200.82    <none>                      9000:32109/TCP      17h
Note that the nginx service has been restored but it has been assigned a new IP address by the LoadBalancer. This is normal. Now let’s see if we can successfully reach our nginx web service on that IP address. Yes I can! Looks like the restore was successful.
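If you prefer the command line to a browser, a curl against the new LoadBalancer IP does the same check; the default nginx landing page contains a "Welcome to nginx!" heading:

$ curl -s http://192.168.191.67 | grep -i "welcome to nginx"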
Cool. Backups and restores are now working on Kubernetes deployed on vSphere with Enterprise PKS, using Velero 1.0. If you want to see the steps involved in backing up persistent volumes as well, check back on some of my earlier Velero posts. Also check out the official Velero 1.0 docs. You may also be interested in listening to a recent podcast we had on Velero.