More Velero – Cassandra backup and restore
In my previous exercise with Heptio Velero, I looked at backing up and restoring a Couchbase deployment. This time I turned my attention to another popular containerized application, Cassandra. Cassandra is a NoSQL database, similar in some respects to Couchbase. Once again, I will be deploying Cassandra as a set of containers and persistent volumes on Kubernetes running on top of PKS, the Pivotal Container Service. And again, just like my last exercise, I will be instantiating the Persistent Volumes as virtual disks on vSAN. I'll show you how to get Cassandra up and running quickly by sharing my YAML files, then, once we have taken a backup with Heptio Velero (formerly Ark), we will destroy the namespace where Cassandra is deployed. We will then restore the Cassandra deployment from our Velero backup and verify that our data is still intact.
Since I went through all of the initial setup steps in my previous post, I will get straight to the Cassandra deployment, followed by the backup, restore with Velero and then verification of the data.
In my deployment, I went with 3 distinct YAML files: the service, the storage class and the statefulset. The first one shown here is the service YAML for my headless Cassandra deployment. Not much to say here, other than the fact that because the service is headless, and we won't be forwarding any traffic to the pods through it, we don't need a cluster IP.
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
  namespace: cassandra
spec:
  # headless does not need a cluster IP
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
Next up is the storage class. Regular readers will be familiar with this concept by now; in a nutshell, it allows us to do dynamic provisioning of volumes for our application. This storage class uses the K8s vSphere Volume Driver, consumes an SPBM policy called gold, and creates virtual disks for the persistent volumes on the vSAN datastore of this vSphere cluster.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: thin
  storagePolicyName: gold
  datastore: vsanDatastore
Lastly, we come to the statefulset itself, which allows Pods and PVs to be scaled together. There are a number of things to highlight here. The first is the Cassandra container image version. These images can be retrieved from gcr.io/google-samples. I went all the way back to v11 because this image includes the cqlsh tool for working on the database. There are other options available if you choose to use later versions of the image, such as deploying a separate container with cqlsh, but I found it easier just to log onto the Cassandra containers and run my cqlsh commands from there. I've actually pulled down the Cassandra image and pushed it up to my own local Harbor registry, which is where I am retrieving it from. Another thing to note is the DNS name of the Cassandra SEED node, which is what allows the cluster to form. Since I am deploying to a separate namespace called cassandra, I need to ensure that the DNS name reflects that below. Last but not least is the volume section, which references the storage class and allows a dynamic PV to be created for each Pod in the Cassandra deployment as it scales in and out.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  namespace: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        # My image is on my harbor registry
        image: harbor.rainpole.com/library/cassandra:v11
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        # Make sure the DNS name matches the namespace
        - name: CASSANDRA_SEEDS
          value: "cassandra-0.cassandra.cassandra.svc.cluster.local"
        - name: CASSANDRA_CLUSTER_NAME
          value: "K8Demo"
        - name: CASSANDRA_DC
          value: "DC1-K8Demo"
        - name: CASSANDRA_RACK
          value: "Rack1-K8Demo"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
      # Match the annotation to the storage class name defined previously
      annotations:
        volume.beta.kubernetes.io/storage-class: vsan
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
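Before looking at the running configuration, a quick word on how these manifests get applied. A minimal sketch, assuming the three YAML files are saved with the names shown below (my own naming, adjust to suit), is to create the namespace first and then apply each file:

cormac@pks-cli:~/Cassandra$ kubectl create ns cassandra
cormac@pks-cli:~/Cassandra$ kubectl apply -f cassandra-sc.yaml
cormac@pks-cli:~/Cassandra$ kubectl apply -f cassandra-service.yaml
cormac@pks-cli:~/Cassandra$ kubectl apply -f cassandra-statefulset.yaml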
Let’s take a look at the configuration after Cassandra has been deployed. Note that the statefulset requested 3 replicas.
cormac@pks-cli:~/Cassandra$ kubectl get sts -n cassandra
NAME DESIRED CURRENT AGE
cassandra 3 3 54m
cormac@pks-cli:~/Cassandra$ kubectl get po -n cassandra
NAME READY STATUS RESTARTS AGE
cassandra-0 1/1 Running 0 54m
cassandra-1 1/1 Running 3 54m
cassandra-2 1/1 Running 2 54m
cormac@pks-cli:~/Cassandra$ kubectl get pvc -n cassandra
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cassandra-data-cassandra-0 Bound pvc-c61a6e97-4be8-11e9-be9b-005056a24d92 1Gi RWO vsan 54m
cassandra-data-cassandra-1 Bound pvc-c61ba5d2-4be8-11e9-be9b-005056a24d92 1Gi RWO vsan 54m
cassandra-data-cassandra-2 Bound pvc-c61cadc6-4be8-11e9-be9b-005056a24d92 1Gi RWO vsan 54m
cormac@pks-cli:~/Cassandra$ kubectl get svc -n cassandra
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cassandra ClusterIP None <none> 9042/TCP 55m
It all looks OK from a K8s perspective. We can use the nodetool CLI to check the state of the Cassandra cluster and verify that all 3 nodes have joined.
cormac@pks-cli:~/Cassandra$ kubectl exec -it cassandra-0 -n cassandra -- nodetool status
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.200.30.203 133.7 KiB 32 54.4% a0baa626-ac99-45cc-a2f0-d45ac2f9892c Rack1-K8Demo
UN 10.200.57.61 231.56 KiB 32 67.9% 95b1fdb8-2138-4b5d-901e-82b9b8c4b6c6 Rack1-K8Demo
UN 10.200.99.101 223.25 KiB 32 77.7% 3477bb48-ad60-4716-ac5e-9bf1f7da3f42 Rack1-K8Demo
Now we can use the cqlsh command mentioned earlier to create a dummy table and populate it with some content (like most of this setup, I simply picked these up from a quick Google search; I'm sure you can be far more elaborate should you wish).
cormac@pks-cli:~/Cassandra$ kubectl exec -it cassandra-0 -n cassandra -- cqlsh
Connected to K8Demo at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh> CREATE KEYSPACE demodb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 };
cqlsh> use demodb;
cqlsh:demodb> CREATE TABLE emp(emp_id int PRIMARY KEY, emp_name text, emp_city text, emp_sal varint,emp_phone varint);
cqlsh:demodb> INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal) VALUES (100, 'Cormac', 'Cork', 999, 1000000);
cqlsh:demodb> select * from emp;

 emp_id | emp_city | emp_name | emp_phone | emp_sal
--------+----------+----------+-----------+---------
    100 |     Cork |   Cormac |       999 |  100000

(1 rows)
cqlsh:demodb> exit;
Next, we can start on the backup preparations, first of all annotating each pod so that Velero knows which persistent volumes to back up.
cormac@pks-cli:~/Cassandra$ kubectl -n cassandra annotate pod/cassandra-2 backup.velero.io/backup-volumes=cassandra-data
pod/cassandra-2 annotated
cormac@pks-cli:~/Cassandra$ kubectl -n cassandra annotate pod/cassandra-1 backup.velero.io/backup-volumes=cassandra-data
pod/cassandra-1 annotated
cormac@pks-cli:~/Cassandra$ kubectl -n cassandra annotate pod/cassandra-0 backup.velero.io/backup-volumes=cassandra-data
pod/cassandra-0 annotated
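Since the annotation is identical for every pod, you could just as easily script this step. A small sketch, assuming the three pod names above:

cormac@pks-cli:~/Cassandra$ for pod in cassandra-0 cassandra-1 cassandra-2; do kubectl -n cassandra annotate pod/$pod backup.velero.io/backup-volumes=cassandra-data; done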
Finally, we initiate the backup. This time I am going to tell Velero to skip all of the other namespaces so that it only backs up the cassandra namespace. Note that there are various ways of achieving this, with selectors and so on, so this isn't necessarily the most optimal approach, but it works (an alternative using --include-namespaces is shown just after the command).
cormac@pks-cli:~/Cassandra$ velero backup create cassandra --exclude-namespaces velero,default,kube-public,kube-system,pks-system,couchbase
Backup request "cassandra" submitted successfully.
Run `velero backup describe cassandra` or `velero backup logs cassandra` for more details.
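For completeness, a tidier way to scope the backup to just this namespace would be to include it explicitly rather than exclude everything else, something along the lines of:

velero backup create cassandra --include-namespaces cassandra

The --include-namespaces option avoids having to list every other namespace by hand.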
I typically put a watch -n 5 in front of the 'velero backup describe' command so I can see it getting regularly updated with progress.
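Something like the following, reusing the backup name from above:

watch -n 5 velero backup describe cassandra

When the backup is complete, it can be listed as follows: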
cormac@pks-cli:~/Cassandra$ velero backup get
NAME STATUS CREATED EXPIRES STORAGE LOCATION SELECTOR
all Completed 2019-03-21 10:43:43 +0000 GMT 29d default <none>
all-and-cb Completed 2019-03-21 10:51:26 +0000 GMT 29d default <none>
all-cb-2 Completed 2019-03-21 11:11:04 +0000 GMT 29d default <none>
cassandra Completed 2019-03-21 14:43:25 +0000 GMT 29d default <none>
Time to see if we can restore it. As before, we can now destroy our current data. In my case, I am just going to remove the namespace where my Cassandra objects reside (PODs, PVs, service, StatefulSet), and then recover it using Velero.
cormac@pks-cli:~/Cassandra$ kubectl delete ns cassandra
namespace "cassandra" deleted

cormac@pks-cli:~/Cassandra$ velero restore create cassandra-restore --from-backup cassandra
Restore request "cassandra-restore" submitted successfully.
Run `velero restore describe cassandra-restore` or `velero restore logs cassandra-restore` for more details.
You can monitor the restore in the same way as you monitored the backup, using a watch -n 5. You can also watch the namespace, PVs and Pods being recreated using kubectl.
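For example, something along these lines, using the restore name created above:

watch -n 5 velero restore describe cassandra-restore
kubectl get ns cassandra
kubectl get pvc,pods -n cassandra

Once everything has been restored, we can verify that our data is still intact using the same commands as before.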
cormac@pks-cli:~/Cassandra$ kubectl exec -it cassandra-0 -n cassandra -- cqlsh
Connected to K8Demo at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh> select * from demodb.emp;
emp_id | emp_city | emp_name | emp_phone | emp_sal
--------+----------+----------+-----------+---------
100 | Cork | Cormac | 999 | 100000
(1 rows)
cqlsh>
So we have had a successful backup and restore, using Heptio Velero, of Cassandra running as a set of containers in K8s on PKS, with Persistent Volumes on vSAN. Neat!