Safekeeping – a useful tool for interacting with First Class Disks/Improved Virtual Disks

I have been doing quite a bit of work on First Class Disks (FCD), also known as Improved Virtual Disks (IVD), over the past number of months. One tool that has been extremely useful in improving my understanding of FCDs is safekeeping, developed by Max Daneri of VMware and now available to download on GitHub. If you did not know, FCDs are used extensively in VMware’s new Cloud Native Storage (CNS) offering, which is currently available with vSphere/vSAN 6.7U3. Whilst the primary aim of this tool is to help backup vendors become familiar with the nuances of backing up and restoring FCDs, it does have some other really interesting use cases.

One area where I found safekeeping to be extremely beneficial is in dealing with stranded or orphaned FCD entries, especially in the context of CNS. These orphans can occur for a few reasons. Typically, if you manually delete VMDKs outside of Kubernetes, the CNS database that tracks the PV-to-VMDK mapping is bypassed and so is not updated, leaving PVs displayed in the CNS UI whose backing disks no longer exist. I’ve used safekeeping in the past as a way of cleaning up these stale entries in a controlled fashion.

*** Warning: Safekeeping is a proof-of-concept tool and is not supported by VMware. Use at your own risk ***

VM Deployment

To deploy Safekeeping, you will need a CentOS 7 distro deployed in a VM. For my setup, I chose a ‘Server with GUI’ environment, and included ‘Development Tools’ as an Add-On.

I also installed VMware Tools. I’m not sure these were needed, but I installed them anyway out of force of habit. The following steps were all done as root, as I ran into some issues doing the deployment via sudo as another admin user.

The next step was to install the remaining tools needed to build the safekeeping binaries. I installed ant and the Java OpenJDK. I also had to install the Java OpenJDK Devel package to get an updated version of the javac binary, since safekeeping requires version 1.8. All 3 were installed via yum:

# yum install -y ant
# yum install -y java-1.8.0-openjdk
# yum install -y java-1.8.0-openjdk-devel
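
Before moving on to the build, it may be worth confirming that the three packages are actually usable. The following is only a sketch and not part of safekeeping: the function names is_java8 and check_build_tools are hypothetical, and it assumes that javac reports its version in the form "javac 1.8.0_232".

```shell
#!/usr/bin/env bash
# Sketch of a pre-build check. Helper names are hypothetical, not part of safekeeping.

# Succeeds if the given version string denotes Java 1.8 (e.g. "1.8.0_232").
is_java8() {
    case "$1" in
        1.8|1.8.*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Reports any of ant, java or javac missing from PATH, then checks the javac version.
check_build_tools() {
    local tool missing=0
    for tool in ant java javac; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ] || return 1

    # Assumption: the version is the second word of the output, e.g. "javac 1.8.0_232".
    # Java 8 prints this on stderr, so both streams are captured.
    local version
    version=$(javac -version 2>&1 | awk '{print $2}')
    if is_java8 "$version"; then
        echo "javac $version OK"
    else
        echo "javac $version found, but 1.8 is required"
        return 1
    fi
}
```

Sourcing this file and running check_build_tools as root should flag a missing tool or a javac that is not 1.8 before ant gets a chance to fail on it.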

Install safekeeping and VDDK

The next step is to download the safekeeping ZIP from GitHub, and the Virtual Disk Development Kit (VDDK) from VMware. Once the safekeeping ZIP has been downloaded onto the CentOS VM, simply unzip it. This creates a directory called safekeeping-master, which contains a sub-directory called packages. The VDDK tar.gz should be placed here, without extracting it:

# cp VMware-vix-disklib-6.7.3-14389676.x86_64.tar.gz safekeeping-master/packages/

Now change directory to safekeeping-master and run the following commands:

# ant configure
# ant install

If everything goes as planned, you should now have a safekeeping binary successfully created.

Using safekeeping

On launching safekeeping, you will be prompted for a number of inputs. This is due to the primary use case of safekeeping, which is related to backing up and restoring of FCDs. Thus there are numerous ‘repositories’ for storing ‘backups’ that can be configured for this testing. However, since we are focused on managing stranded or orphaned FCD entries, we do not need to worry too much about these. I did keep a local FILESTORAGE repository, as this may be useful if I want to test some additional operations on FCDs later on (e.g. snapshot/backup/restore). Here is the complete list of inputs I provided when I launched safekeeping for the first time:

# safekeeping
Safekeeping Version 1.0.1 - VDDK Version 6.7.3
Server: safekeeping.rainpole.com ip:10.27.51.76 uuid:b97d2242-e51f-cea5-5504-59a91d6c022c
There is a problem with the configuration. Manual reconfiguration enforced

PSCPROVIDER
VMware Platform Service Controller FQDN or IP []vcsa-06-b.rainpole.com

PSCPROVIDER
         Login account []administrator@vsphere.local
         Password <hidden>******
        Repeat password******

AMAZONS3
Activate the Plugin [false]
S3 Region [us-west-2]
S3 Backet Name/folder []
S3 Access Key []
S3 Secret Key <hidden>

FILESTORAGE
Activate the Plugin [true]
File Archive directory [/opt/vmware/safekeeping/archive]

NFSSTORAGE
Activate the Plugin [false]
NFS Absolute Path (server:/Export) [localhost:/vol]
User UID [0]
User GID [0]
Max number of retry [5]
Use port <1024 [false]
Use portmap port <1024 [false]
NFS Timeout [20]

GLOBAL
Default Target Repository class (amazonS3 fileStorage nfsStorage ) [amazonS3]fileStorage
Accept Untrusted Certificate [true]
VDDK Transport Mode for Full backups [hotadd]
VDDK Transport Mode for Incremental backups [nbdssl]
VDDK Transport Mode for Restore [hotadd]

Max number of concurrent post threads [5]
Max number of concurrent get threads [5]
Max atomic block size in MB (increase the value required more RAM) [250]
Enable compression [true]

FILTER
Virtual Machine Folder Filter []
Virtual Machine Resource pool []

Do you conferm? (Yes/No)Yes

Interactive Mode
<help> for help  - <quit> to leave
vmbk#

And now we are able to do operations on our First Class Disks, or IVDs as referred to by safekeeping. One of the things we can do is list the IVDs that are currently present on the vCenter server that I pointed to during the safekeeping setup.

vmbk# ivd -list
Connecting...
Connected to Platform Service Controller: https://vcsa-06-b.rainpole.com/sts/STSService
LookupService reports 1 VimService Instance
Connected to VimService uuid: 314ef7ae-3105-42de-8957-6a504419d257 url: https://vcsa-06-b.rainpole.com:443/sdk
        VMware vCenter Server 6.7.0 build-14368073 - Api Version: 6.7.3

Connected to Storage Profile Service Ver.2.0 url: https://vcsa-06-b.rainpole.com/pbm
Connected to VMware Virtual Storage Lifecycle Manager Service 1.0.0 url: https://vcsa-06-b.rainpole.com/vslm/sdk
Connected to Vapi Service url: https://vcsa-06-b.rainpole.com/api

entity  uuid                                    name                                 size         cbt   snapshot    attached               datastore       path

ivd     57f69494-24a2-4b28-9d1b-bc43fe08a0d2    pvc-8c7b2535-eb45-11e9-80e4...        1024MB    false         no         yes           vsanDatastore       33d05a5d-e436-3297-94f7-246e962f4910/271c0c4065e545e7a2a1fefdd17cf4d3.vmdk
ivd     937b445e-bd7d-4576-987e-5f7e32147af1    pvc-8c7968d5-eb45-11e9-80e4...        1024MB    false         no         yes           vsanDatastore       33d05a5d-e436-3297-94f7-246e962f4910/ff34579cbed54dc994976885e22c7cb9.vmdk
ivd     d638ac80-88f8-40e4-bca7-281bc7b98fde    pvc-8c774366-eb45-11e9-80e4...        1024MB    false         no         yes           vsanDatastore       33d05a5d-e436-3297-94f7-246e962f4910/6a02671fb7ed4358a02f39e71e764bd2.vmdk

vmbk#
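
Since the listing is whitespace-delimited, the uuid and name columns are easy to pull out for later use. This is a minimal sketch which assumes the output has been copied by hand into a text file (ivd-list.txt is a hypothetical name), and that disk names contain no spaces. The function name ivd_uuid_name is also hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: extract "uuid name" pairs from a saved copy of the 'ivd -list' output.

# $1: text file containing the pasted listing.
# Only data rows are printed, i.e. rows whose first field is "ivd";
# the header row, blank lines and the prompt are skipped.
ivd_uuid_name() {
    awk '$1 == "ivd" { print $2, $3 }' "$1"
}
```

Run against the listing above, ivd_uuid_name ivd-list.txt prints one line per disk, such as "57f69494-24a2-4b28-9d1b-bc43fe08a0d2 pvc-8c7b2535-eb45-11e9-80e4...". Note that the name stays truncated, exactly as safekeeping displays it.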

Comparing these IVDs with the volumes shown in the CNS UI on that vCenter server, we should observe that they do indeed match.

And now we come to the use case. Let’s assume that the Kubernetes cluster was manually removed from vSphere, or that individual VMDKs were removed outside of Kubernetes. This would leave some stale PVs in the FCD database, which would still be displayed in the CNS UI. You can use the ivd -help command at the vmbk# prompt to display usage and some examples. One example shows how you could remove an FCD/IVD, which also removes the stale entry from the FCD database in vCenter, and thus removes it from the UI.

Example:

# ivd -remove ivd:9a583042-cb9d-5673-cbab-56a02a91805d

Delete the Improved Virtual Disk with uuid 9a583042-cb9d-5673-cbab-56a02a91805d
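
Before removing anything, it helps to narrow down which IVDs are likely to be stale. The sketch below compares IVD names against the PV names that Kubernetes still knows about and prints the ones with no match. It rests on several assumptions: that the IVD name is the PV name (as the pvc-... names in the listing suggest), that the listing truncates long names with "...", and that the two input files have been prepared by hand. The function name stale_ivd_candidates is hypothetical. It only covers the case where the PV is gone on the Kubernetes side, and it only reports candidates; it does not delete anything.

```shell
#!/usr/bin/env bash
# Sketch: list IVDs whose (possibly truncated) name matches no Kubernetes PV name.

# $1: file of "uuid name" pairs, one per line, taken from the 'ivd -list' output
# $2: file of Kubernetes PV names, one per line
stale_ivd_candidates() {
    awk '
        NF == 0 { next }                          # skip blank lines in either file
        FILENAME == ARGV[1] { pv[$1] = 1; next }  # first file read: remember PV names
        {
            name = $2
            sub(/\.\.\.$/, "", name)              # drop the truncation marker
            found = 0
            for (p in pv)
                if (index(p, name) == 1) { found = 1; break }  # PV name starts with IVD name
            if (!found) print $1, $2
        }
    ' "$2" "$1"
}
```

One way to produce the PV file is kubectl get pv --no-headers -o custom-columns=NAME:.metadata.name > pv-names.txt, run against the cluster in question. Each uuid that the function prints is a candidate to review, and once you are sure it is stale it can be removed with the ivd -remove ivd:<uuid> form shown in the help example.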

Caveats

A few things to be aware of before finishing this post.

  1. Safekeeping is a proof of concept tool. As it is available on GitHub, I am sure you can draw your own conclusions on the support around this tool. If it is not obvious, VMware does not support safekeeping in any way, shape or form!
  2. On trying to run the ‘ant install’ command via sudo, I kept hitting ‘Invalid cross-device link’ errors. When I did the deployment as root, I had no such issues.
  3. Java 1.8 is required, both the runtime and the javac compiler, which comes from the Devel package. Make sure you install the correct packages.
  4. This tool does so much more than just delete IVDs/FCDs. If you have a non-production environment, you can check out the many other features around IVD snapshot/backup/restore. Very interesting from a Kubernetes PV perspective.

2 Replies to “Safekeeping – a useful tool for interacting with First Class Disks/Improved Virtual Disks”

  1. Hi Cormac,

    Do you know if this functionality is being planned for addition to Velero? Or is it only being used for vendor-based education of PV backup/restore for integration into their own products?

    Great article and very useful, I am going to download and start playing around with this. Thanks for all you do!

    Best,
    Tim

    1. We do have some plans to create our own vSphere plugin for Velero rather than rely on the third party restic plugin. The plan is for this plugin to leverage snapshots (VADP approach) and then enable mobility of the incremental snapshot backups to a third party. I’ll be able to share more as we get closer to the next major vSphere release.
