Getting started with Velero 1.0.0-rc.1

Last week, the Velero team announced the availability of release candidate (RC) version 1.0.0. I was eager to get my hands on it and try it out. Since it is an RC (and not GA), I thought I would deploy a fresh environment for testing. The guidance from the Velero team is to test it out in your non-critical environments! On a number of the Velero GitHub pages, the links to download the binaries do not appear to be working, and some of the install guidance is a little sparse. Anyhow, after some trial and error, I decided it might be useful to document the steps I went through to fully deploy the 1.0.0 RC version.

Velero Client

First of all, the site to get the 1.0.0-rc.1 tarball is here – https://github.com/heptio/velero/releases/tag/v1.0.0-rc.1. From there, you can download the appropriate tarball for your platform, which provides the velero client binary.
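
If you prefer to pull it down from the command line, the asset should be available at the standard GitHub release download URL. That URL is my own assumption, so check the releases page if it does not resolve:

# download the Linux amd64 tarball (URL assumed from the release tag and asset name)
wget https://github.com/heptio/velero/releases/download/v1.0.0-rc.1/velero-v1.0.0-rc.1-linux-amd64.tar.gz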

cormac@pks-cli:~/Velero_v1.0rc$ tar zxvf velero-v1.0.0-rc.1-linux-amd64.tar.gz
velero-v1.0.0-rc.1-linux-amd64/LICENSE
velero-v1.0.0-rc.1-linux-amd64/examples/README.md
velero-v1.0.0-rc.1-linux-amd64/examples/minio
velero-v1.0.0-rc.1-linux-amd64/examples/minio/00-minio-deployment.yaml
velero-v1.0.0-rc.1-linux-amd64/examples/nginx-app
velero-v1.0.0-rc.1-linux-amd64/examples/nginx-app/README.md
velero-v1.0.0-rc.1-linux-amd64/examples/nginx-app/base.yaml
velero-v1.0.0-rc.1-linux-amd64/examples/nginx-app/with-pv.yaml
velero-v1.0.0-rc.1-linux-amd64/velero

From the extracted contents, you can retrieve the client-side binary ‘velero’. Simply copy this to somewhere in your local PATH, e.g. /usr/local/bin. You are now good to go from a client perspective.
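
As a minimal sketch, assuming the tarball was extracted in the current directory and that /usr/local/bin is on your PATH:

# copy the velero client binary onto the PATH
sudo cp velero-v1.0.0-rc.1-linux-amd64/velero /usr/local/bin/
# quick sanity check that the binary runs
velero --help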

Velero Server

When you examine the extracted RC tarball, you will notice that some files are missing when compared to the Velero v0.11 download. The missing files are the following:

config/common/00-prereqs.yaml
config/minio/00-minio-deployment.yaml  
config/minio/05-backupstoragelocation.yaml  
config/minio/20-deployment.yaml
config/minio/30-restic-daemonset.yaml

Now, in my original Velero post, I modified config/minio/00-minio-deployment.yaml to give me a decent-sized Minio S3 store for my on-prem, on-vSphere backups (you can find the details in that post). Since I was keen to keep this configuration, I simply copied these YAML files from my v0.11 deployment folder to my v1.0.0-rc.1 deployment folder. Only two additional changes were needed, and these are simply to use the latest Velero server and restic images. The files that needed updating were 20-deployment.yaml and 30-restic-daemonset.yaml.

From:
20-deployment.yaml:          image: gcr.io/heptio-images/velero:v0.11.0
30-restic-daemonset.yaml:    image: gcr.io/heptio-images/velero:v0.11.0
To:
20-deployment.yaml:          image: gcr.io/heptio-images/velero:v1.0.0-rc.1
30-restic-daemonset.yaml:    image: gcr.io/heptio-images/velero:v1.0.0-rc.1
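
If you would rather make the change from the command line, a one-liner along these lines should do it (a sketch, assuming the file names and image tags shown above):

# swap the v0.11.0 image tag for the v1.0.0-rc.1 tag in both manifests
sed -i 's|gcr.io/heptio-images/velero:v0.11.0|gcr.io/heptio-images/velero:v1.0.0-rc.1|g' \
  config/minio/20-deployment.yaml config/minio/30-restic-daemonset.yaml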


With the image references updated, you can deploy the Velero server components by applying the YAML in the new common and minio folders.
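
As with v0.11, I applied the common folder first (which contains the prereqs – namespace, CRDs and RBAC), followed by the minio folder. A minimal sketch, assuming you kept the same folder layout as the v0.11 deployment:

# create the prereqs (velero namespace, CRDs, RBAC), then the Minio, Velero and restic components
kubectl apply -f config/common/
kubectl apply -f config/minio/

If everything is working correctly after the deployment, velero version should report the new RC version numbers for both client and server: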

cormac@pks-cli:~$ velero version
Client:
        Version: v1.0.0-rc.1
        Git commit: d05f8e53d8ecbdb939d5d3a3d24da7868619ec3d
Server:
        Version: v1.0.0-rc.1

It is also worth making sure that the velero and restic pods, as well as the minio pod, have started successfully. But as mentioned, this should be no different to v0.11. Note that when using PKS, the ‘allow privileged’ checkbox should be selected on the PKS tile in Pivotal Ops Manager for the restic portion to work correctly. Again, refer to my earlier blog post if you want full setup instructions. It should look something like this.

cormac@pks-cli:~$ kubectl get pod -n velero

NAME                      READY   STATUS      RESTARTS   AGE
minio-67fc7ffcdf-r8r4g    1/1     Running     0          4m51s
minio-setup-hj24t         0/1     Completed   1          4m51s
restic-bjdp2              1/1     Running     0          4m50s
restic-cj8sx              1/1     Running     0          4m50s
restic-cljrk              1/1     Running     0          4m50s
restic-q4xdn              1/1     Running     0          4m50s
velero-6756989966-dmhrc   1/1     Running     0          4m50s

Backup and Restore

Now for the real test. Let’s take a backup of an application, delete the original, and see if we can restore it once more with this RC version of Velero. I’m going to use my tried and trusted Cassandra instance, running in its own namespace, also called cassandra, which I have already populated with some simple data. The idea is to take a backup with Velero, delete the namespace where Cassandra exists, then restore it once again with Velero. My Cassandra instance has a single PVC/PV, and since this is running on vSphere and PKS, we will be using the restic support that comes with Velero to back up the volume contents. I have also added the correct annotation to get the PV contents included in the backup. Once I have completed the backup and deleted the namespace, I will initiate a restore, and then verify that my database contents are still intact.
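
One thing to note is that the backup.velero.io/backup-volumes annotation takes the name of the volume as defined in the pod spec (cassandra-data in my case), not the name of the PVC. If you need to double-check which volume name to use, something along these lines will list the volumes in the pod spec (adjust the pod and namespace for your own environment):

# list the volume names defined in the cassandra-0 pod spec
kubectl -n cassandra get pod cassandra-0 -o jsonpath='{.spec.volumes[*].name}'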

Here are the complete steps. They should be quite easy to follow.


cormac@pks-cli:~$ kubectl -n cassandra annotate pod/cassandra-0 backup.velero.io/backup-volumes=cassandra-data --overwrite
pod/cassandra-0 annotated


cormac@pks-cli:~$ velero backup create cassandra --exclude-namespaces default,kube-public,kube-system,pks-system,velero
Backup request "cassandra" submitted successfully.
Run `velero backup describe cassandra` or `velero backup logs cassandra` for more detail


cormac@pks-cli:~$ velero backup describe cassandra
Name: cassandra
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: <none>

Phase: Completed

Namespaces:
 Included: *
 Excluded: default, kube-public, kube-system, pks-system, velero

Resources:
 Included: *
 Excluded: <none>
 Cluster-scoped: auto

Label selector: <none>

Storage Location: default

Snapshot PVs: auto

TTL: 720h0m0s

Hooks: <none>

Backup Format Version: 1

Started: 2019-05-13 09:02:01 +0100 IST
Completed: 2019-05-13 09:02:06 +0100 IST

Expiration: 2019-06-12 09:02:01 +0100 IST

Persistent Volumes: <none included>

Restic Backups (specify --details for more information):
 Completed: 1

cormac@pks-cli:~$ velero backup get
NAME      STATUS    CREATED                        EXPIRES  STORAGE LOCATION  SELECTOR
cassandra Completed 2019-05-13 09:02:01 +0100 IST  29d      default           <none>


cormac@pks-cli:~$ velero backup logs cassandra | grep restic
time="2019-05-13T08:02:01Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/cassandra group=v1 logSource="pkg/backup/item_backupper.go:390" name=pvc-71eedeae-7328-11e9-b153-005056a29b20 namespace=cassandra persistentVolume=pvc-71eedeae-7328-11e9-b153-005056a29b20 resource=pods


cormac@pks-cli:~$ kubectl delete ns cassandra
namespace "cassandra" deleted


cormac@pks-cli:~$ kubectl get ns
NAME          STATUS   AGE
default       Active   15d
kube-public   Active   15d
kube-system   Active   15d
pks-system    Active   15d
velero        Active   21m


cormac@pks-cli:~$ velero restore create cassandra-restore --from-backup cassandra
Restore request "cassandra-restore" submitted successfully.
Run `velero restore describe cassandra-restore` or `velero restore logs cassandra-restore` for more details.


cormac@pks-cli:~$ velero restore describe cassandra-restore
Name: cassandra-restore
Namespace: velero
Labels: <none>
Annotations: <none>

Phase: Completed

Backup: cassandra

Namespaces:
 Included: *
 Excluded: <none>

Resources:
 Included: *
 Excluded: nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
 Cluster-scoped: auto

Namespace mappings: <none>

Label selector: <none>

Restore PVs: auto

Restic Restores (specify --details for more information):
 Completed: 1


cormac@pks-cli:~$ velero restore get
NAME              BACKUP    STATUS    WARNINGS ERRORS CREATED                       SELECTOR
cassandra-restore cassandra Completed 0        0      2019-05-13 09:13:34 +0100 IST <none>


cormac@pks-cli:~$ kubectl get ns
NAME          STATUS   AGE
cassandra     Active   51s
default       Active   15d
kube-public   Active   15d
kube-system   Active   15d
pks-system    Active   15d
velero        Active   22m


cormac@pks-cli:~$ kubectl get pod -n cassandra
NAME          READY   STATUS    RESTARTS   AGE
cassandra-0   0/1     Running   0          49s


cormac@pks-cli:~$ kubectl get pvc -n cassandra
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cassandra-data-cassandra-0   Bound    pvc-71eedeae-7328-11e9-b153-005056a29b20   1Gi        RWO            cass-sc        65s


cormac@pks-cli:~$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                  STORAGECLASS   REASON   AGE
pvc-71eedeae-7328-11e9-b153-005056a29b20   1Gi        RWO            Delete           Bound    cassandra/cassandra-data-cassandra-0   cass-sc                 72s
pvc-ba7ad6a0-7325-11e9-b153-005056a29b20   10Gi       RWO            Delete           Bound    velero/minio-pv-claim-1                minio-sc                19m

cormac@pks-cli:~$ kubectl exec -it cassandra-0 -n cassandra -- nodetool status
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address       Load        Tokens  Owns (effective)  Host ID                               Rack
UN 10.200.87.32  112.04 KiB  32      100.0%            581614a6-eb8a-4498-936a-427ec6d2897e  Rack1-K8Demo


cormac@pks-cli:~$ kubectl exec -it cassandra-0 -n cassandra -- cqlsh
Connected to K8Demo at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh> use demodb;
cqlsh:demodb> select * from emp;

emp_id | emp_city | emp_name | emp_phone | emp_sal
--------+----------+----------+-----------+---------
    100 |     Cork |   Cormac |       999 | 1000000
(1 rows)

cqlsh:demodb>


Looks good to me. I successfully backed up my Cassandra app, deleted the namespace in which it was running (which removed its Pods and PVs), then successfully restored it from Velero, and validated that the data is still in place. Looks like the RC is working for us.