A first look at the Couchbase Operator

A few weeks back, I took a look at Heptio Velero, formerly known as Ark. Velero provides backup and restore capabilities for cloud native applications. During that research, I used a Couchbase database as my application of choice for backup/restore. When I spoke to Couchbase about that blog post, they strongly recommended that I try the new Couchbase Operator rather than the StatefulSet method I was using to deploy the application. Couchbase talk about the advantages of the operator approach over StatefulSets here.

Now, while Couchbase provide steps on how to deploy Couchbase with their operator, they create it in the default K8s namespace. In my test, I want to put Couchbase in its own namespace. The steps here are intended to get you started with the new Couchbase Operator, running on vSphere and vSAN infrastructure, in its own Kubernetes namespace. I also talk about some issues with the bundled load-generating tool, pillowfight.

Couchbase provide prescriptive instructions on how to get started with their operator here. It includes all the necessary configuration files. A few things about the operator:

When loaded, it downloads the Operator Docker image specified in the operator.yaml file. It runs as a Kubernetes Deployment, so the operator pod is recreated if it dies.
It creates the CouchbaseCluster custom resource definition (CRD).
It starts listening for CouchbaseCluster events.

I made a few modifications to allow Couchbase to run in its own namespace:

I first of all created a new namespace (obviously) called couchbase.
After the cluster role was created, I created the service account in the new couchbase namespace and then assigned the cluster role to that service account using a cluster role binding.
I modified the operator.yaml file to include a metadata.namespace entry set to couchbase, so that the operator is deployed in the couchbase namespace.
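
The namespace changes listed above can also be expressed declaratively. What follows is a minimal sketch rather than Couchbase's own manifest. It assumes the ClusterRole created from the Couchbase instructions is named couchbase-operator, and the binding name is made up for illustration.

```yaml
# Sketch only: declarative equivalent of the namespace changes listed above.
# Assumption: the ClusterRole from the Couchbase instructions is named
# couchbase-operator; adjust roleRef.name if yours differs.
apiVersion: v1
kind: Namespace
metadata:
  name: couchbase
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: couchbase-operator
  namespace: couchbase
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: couchbase-operator-couchbase   # hypothetical binding name
subjects:
  - kind: ServiceAccount
    name: couchbase-operator
    namespace: couchbase               # must match the service account's namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: couchbase-operator
```

The important detail is the subject's namespace: a binding that still points at the default namespace leaves the operator in the couchbase namespace without the permissions it needs.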

By monitoring the logs of the couchbase operator pod, we can observe the following startup messages:

$ kubectl logs couchbase-operator-6fcfbd8599-zqsh2 -n couchbase
time="2019-03-20T09:27:41Z" level=info msg="couchbase-operator v1.1.0 (release)" module=main
time="2019-03-20T09:27:41Z" level=info msg="Obtaining resource lock" module=main
time="2019-03-20T09:27:41Z" level=info msg="Starting event recorder" module=main
time="2019-03-20T09:27:41Z" level=info msg="Attempting to be elected the couchbase-operator leader" module=main
time="2019-03-20T09:27:41Z" level=info msg="Event(v1.ObjectReference{Kind:\"Endpoints\", Namespace:\"couchbase\", Name:\"couchbase-operator\", UID:\"68fece18-4af2-11e9-be9b-005056a24d92\", APIVersion:\"v1\", ResourceVersion:\"1596774\", FieldPath:\"\"}): type: 'Normal' reason: 'LeaderElection' couchbase-operator-6fcfbd8599-zqsh2 became leader" module=event_recorder
time="2019-03-20T09:27:41Z" level=info msg="I'm the leader, attempt to start the operator" module=main
time="2019-03-20T09:27:41Z" level=info msg="Creating the couchbase-operator controller" module=main
time="2019-03-20T09:27:46Z" level=info msg="CRD initialized, listening for events..." module=controller
time="2019-03-20T09:27:46Z" level=info msg="starting couchbaseclusters controller"

I was now ready to deploy the Couchbase cluster using the new cbopctl CLI tool. I also needed to make a few changes to the default cluster configuration file (couchbase-cluster-sc.yaml).

I placed it in the couchbase namespace with the metadata.namespace entry.
I set spec.disableBucketManagement to true, which allows me to make changes to buckets via the UI/CLI (otherwise I have to make all changes via edits to the YAML file).
I added persistent volumes for the default and data mounts. This required a new StorageClass, couchbasesc, for the volumeClaimTemplate to use.
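
For reference, a StorageClass for this could look something like the sketch below. It is not the exact one used in this deployment. It assumes the in-tree vSphere volume provisioner, and the storage policy and datastore names are placeholders that need to match what exists in your vCenter.

```yaml
# Sketch only: a StorageClass that the volumeClaimTemplate could reference.
# Assumptions: the in-tree vSphere provisioner is in use; "gold" and
# "vsanDatastore" are hypothetical names for a storage policy and datastore.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: couchbasesc            # matches storageClassName in the cluster YAML
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: thin
  storagePolicyName: gold
  datastore: vsanDatastore
```

StorageClasses are cluster-scoped, so no namespace entry is needed here.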

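Here is the full CouchbaseCluster YAML file. My changes are the metadata.namespace entry, disableBucketManagement, the pod volumeMounts and the volumeClaimTemplates section.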

apiVersion: couchbase.com/v1
kind: CouchbaseCluster
metadata:
  name: cb-example
  namespace: couchbase
spec:
  securityContext:
    fsGroup: 1000
  baseImage: couchbase/server
  version: enterprise-5.5.2
  authSecret: cb-example-auth
  exposeAdminConsole: true
  disableBucketManagement: true
  adminConsoleServices:
    - data
  cluster:
    dataServiceMemoryQuota: 256
    indexServiceMemoryQuota: 256
    searchServiceMemoryQuota: 256
    eventingServiceMemoryQuota: 256
    analyticsServiceMemoryQuota: 1024
    indexStorageSetting: memory_optimized
    autoFailoverTimeout: 120
    autoFailoverMaxCount: 3
    autoFailoverOnDataDiskIssues: true
    autoFailoverOnDataDiskIssuesTimePeriod: 120
    autoFailoverServerGroup: false
  buckets:
    - name: default
      type: couchbase
      memoryQuota: 128
      replicas: 1
      ioPriority: high
      evictionPolicy: fullEviction
      conflictResolution: seqno
      enableFlush: true
      enableIndexReplica: false
  servers:
    - size: 3
      name: all_services
      services:
        - index
        - query
        - search
        - eventing
        - analytics
        - data
      pod:
        volumeMounts:
          default: couchbase
          data: couchbase
  volumeClaimTemplates:
    - metadata:
        name: couchbase
      spec:
        storageClassName: "couchbasesc"
        resources:
          requests:
            storage: 1Gi

I am skipping over the authentication and user requirements which are all documented on the Couchbase site. However, once the application has been deployed, you should be able to see the following in the couchbase operator pod logs:

$ kubectl logs couchbase-operator-6fcfbd8599-zqsh2 -n couchbase
.
.
time="2019-03-20T09:48:34Z" level=info msg="Watching new cluster" cluster-name=cb-example module=cluster
time="2019-03-20T09:48:34Z" level=info msg="Janitor process starting" cluster-name=cb-example module=cluster
time="2019-03-20T09:48:34Z" level=info msg="Setting up client for operator communication with the cluster" cluster-name=cb-example module=cluster
time="2019-03-20T09:48:34Z" level=info msg="Cluster does not exist so the operator is attempting to create it" cluster-name=cb-example module=cluster
time="2019-03-20T09:48:34Z" level=info msg="Creating headless service for data nodes" cluster-name=cb-example module=cluster
time="2019-03-20T09:48:34Z" level=info msg="Creating NodePort UI service (cb-example-ui) for data nodes" cluster-name=cb-example module=cluster
time="2019-03-20T09:48:34Z" level=info msg="Creating a pod (cb-example-0000) running Couchbase enterprise-5.5.2" cluster-name=cb-example module=cluster

And if everything comes up successfully, you can query the pods, persistent volumes and services as they initialize.

$ kubectl get po -n couchbase
NAME                                  READY   STATUS    RESTARTS   AGE
cb-example-0000                       1/1     Running   0          7m8s
cb-example-0001                       1/1     Running   0          6m7s
cb-example-0002                       1/1     Running   0          5m14s
couchbase-operator-6fcfbd8599-zqsh2   1/1     Running   0          28m

$ kubectl get pv -n couchbase
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                STORAGECLASS   REASON   AGE
pvc-100e3c17-40e3-11e9-be9b-005056a24d92   10Gi       RWO            Delete           Bound    velero/minio-pv-claim-1                              minio-sc                12d
pvc-53c6962f-4af5-11e9-be9b-005056a24d92   1Gi        RWO            Delete           Bound    couchbase/pvc-couchbase-cb-example-0000-00-default   couchbasesc             7m51s
pvc-5b30f298-4af5-11e9-be9b-005056a24d92   1Gi        RWO            Delete           Bound    couchbase/pvc-couchbase-cb-example-0000-01-data      couchbasesc             7m44s
pvc-795bcf7b-4af5-11e9-be9b-005056a24d92   1Gi        RWO            Delete           Bound    couchbase/pvc-couchbase-cb-example-0001-00-default   couchbasesc             6m51s
pvc-7edc6a4d-4af5-11e9-be9b-005056a24d92   1Gi        RWO            Delete           Bound    couchbase/pvc-couchbase-cb-example-0001-01-data      couchbasesc             6m43s
pvc-97e9f6a9-4af5-11e9-be9b-005056a24d92   1Gi        RWO            Delete           Bound    couchbase/pvc-couchbase-cb-example-0002-00-data      couchbasesc             6m2s
pvc-9bcc3d4d-4af5-11e9-be9b-005056a24d92   1Gi        RWO            Delete           Bound    couchbase/pvc-couchbase-cb-example-0002-01-default   couchbasesc             5m50s

$ kubectl get svc -n couchbase
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                          AGE
cb-example       ClusterIP   None                           8091/TCP,18091/TCP               8m55s
cb-example-srv   ClusterIP   None                           11210/TCP,11207/TCP              8m55s
cb-example-ui    NodePort    10.100.200.132   <none>        8091:32635/TCP,18091:30198/TCP   3m20s

So, to recap, the steps are:

Follow the documented steps to set up the couchbase ClusterRole.
Create the couchbase namespace: kubectl create ns couchbase
Create the couchbase operator service account in the couchbase namespace: kubectl create serviceaccount couchbase-operator --namespace couchbase
Assign the ClusterRole to that service account using a cluster role binding.
Create the operator (modified for the couchbase namespace): kubectl create -f operator.yaml
Create the necessary secrets (modified for the couchbase namespace): kubectl create -f secret.yaml
Create the couchbase cluster using cbopctl: cbopctl create -f couchbase-cluster-sc.yaml
The kubectl get svc output above shows the port mappings for the Couchbase UI, for both http and https. If we connect to any of the K8s worker nodes on those ports, we should be able to access our Couchbase deployment using the Administrator/password credentials provided in our configuration.
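
Those credentials come from the secret named by authSecret in the cluster definition. A minimal sketch of such a secret, placed in the couchbase namespace, might look like this. It uses stringData so that Kubernetes does the base64 encoding, and the values are simply the ones used throughout this post.

```yaml
# Sketch only: the secret referenced by authSecret: cb-example-auth.
# stringData takes plain text; Kubernetes stores it base64-encoded.
apiVersion: v1
kind: Secret
metadata:
  name: cb-example-auth
  namespace: couchbase      # must be in the same namespace as the cluster
type: Opaque
stringData:
  username: Administrator
  password: password
```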

Initially there are no buckets. There is one defined in the cluster configuration, but it is overridden by the disableBucketManagement setting. We can quickly remedy that by creating a couple of temporary ones. I will add two: the first is called default and the second is called cormac. At the moment there are no items in either bucket.

Couchbase also bundle a utility called pillowfight, which is a very useful way of populating the buckets. For some reason, I had some issues with the bundled “sequoiatools” version of pillowfight. Once I reverted to using the “couchbaseutils” version, all was good. You will need to set up the appropriate user credentials to do this, but once again this is all documented on the main Couchbase operator site. Here is my sample YAML file for populating the default bucket:

apiVersion: batch/v1
kind: Job
metadata:
  name: pillowfight
  namespace: couchbase
spec:
  template:
    metadata:
      name: pillowfight
    spec:
      containers:
      - name: pillowfight
        image: couchbaseutils/pillowfight:v2.9.3
        command: ["cbc-pillowfight",
                  "-U", "couchbase://cb-example-0000.cb-example.couchbase.svc.cluster.local/default?select_bucket=true",
                  "-I", "10000", "-B", "1000", "-c", "10", "-t", "1", "-P", "password"]
      restartPolicy: Never

For the cormac bucket, the only real difference is the command: the connection string drops the select_bucket option, and the username is passed explicitly with -u.

command: ["cbc-pillowfight",
          "-U", "couchbase://cb-example-0000.cb-example.couchbase.svc.cluster.local/cormac",
          "-I", "10000", "-B", "1000", "-c", "10", "-t", "1", "-u", "Administrator", "-P", "password"]
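
Put together, the job for the cormac bucket would look roughly like the sketch below. It is the default-bucket job with the command above swapped in. The job name is an inference from the pillowfightcormac pod in the listing that follows, not something taken from the original file.

```yaml
# Sketch only: the default-bucket job with the cormac command swapped in.
# The job name is inferred from the pillowfightcormac pod shown later.
# cbc-pillowfight options used here:
#   -U  connection string (host and bucket)   -I  number of items
#   -B  batch size                            -c  number of cycles
#   -t  number of threads                     -u / -P  username / password
apiVersion: batch/v1
kind: Job
metadata:
  name: pillowfightcormac
  namespace: couchbase
spec:
  template:
    metadata:
      name: pillowfightcormac
    spec:
      containers:
      - name: pillowfight
        image: couchbaseutils/pillowfight:v2.9.3
        command: ["cbc-pillowfight",
                  "-U", "couchbase://cb-example-0000.cb-example.couchbase.svc.cluster.local/cormac",
                  "-I", "10000", "-B", "1000", "-c", "10", "-t", "1", "-u", "Administrator", "-P", "password"]
      restartPolicy: Never
```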

After running the pillowfight jobs, you will see new pods that have already completed their tasks:

$ kubectl get po -n couchbase
NAME                                  READY   STATUS      RESTARTS   AGE
cb-example-0000                       1/1     Running     0          20m
cb-example-0001                       1/1     Running     0          19m
cb-example-0002                       1/1     Running     0          18m
couchbase-operator-6fcfbd8599-ggv98   1/1     Running     0          24m
create-user-dk6xg                     0/1     Completed   0          89s
pillowfight-fqvgp                     0/1     Completed   0          70s
pillowfightcormac-dmqnf               0/1     Completed   0          7s
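
The create-user pod in this listing comes from the job that sets up the credentials pillowfight logs in with. Its manifest is not shown in this post, so the following is only a sketch of what such a job might look like. It assumes the user is called default, since the first pillowfight command passes no -u and the client then falls back to the bucket name. It also grants the broad admin role purely to keep the example short.

```yaml
# Sketch only: a one-off job that creates the user pillowfight logs in as.
# Assumptions: user "default" with password "password"; the "admin" role is
# used for brevity, a narrower role would be preferable outside a test setup.
apiVersion: batch/v1
kind: Job
metadata:
  name: create-user
  namespace: couchbase
spec:
  template:
    metadata:
      name: create-user
    spec:
      containers:
      - name: create-user
        image: couchbase/server:enterprise-5.5.2
        command: ["/opt/couchbase/bin/couchbase-cli", "user-manage",
                  "-c", "cb-example-0000.cb-example.couchbase.svc.cluster.local:8091",
                  "-u", "Administrator", "-p", "password",
                  "--set",
                  "--rbac-username", "default", "--rbac-password", "password",
                  "--roles", "admin", "--auth-domain", "local"]
      restartPolicy: Never
```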

More importantly, if we take a look at the Couchbase UI, we see that we now have 1,000 items in each bucket.

And that’s it. You are up and running with your Couchbase operator. And to close, this was provisioned in a K8s cluster on top of VMware PKS, vSphere and vSAN infrastructure. BTW, the issue with pillowfight was reported here.

Published by Cormac