Getting Started with Data Services Manager 2.0 – Part 3: Deploy PostgreSQL

Cormac

5 months ago

So far, we have seen how to create an infrastructure policy and how to configure the DSM Provider appliance. In this third post in the series, I will show you how to deploy a Data Service via Data Services Manager (DSM) 2.0. The data service in question is PostgreSQL, an open-source database which our telemetry tells us is a very popular database deployed on vSphere infrastructure by our customers. Let’s examine the steps involved in deploying PostgreSQL via the DSM UI, whilst also noting that DSM has a very rich API allowing deployment of these data services via various automation and programmatic tools, which we will look at in some future posts. Once again I will mention the caveat that this product is not yet generally available (GA), so some of the workflows may end up being different from what I am showing you here. However, for all intents and purposes, this post gives a reasonably correct representation of what you will see at GA time.

Log into DSM UI

In setting up the infrastructure policy in part 1, we also saw how to create a permission for a DSM Admin. Use this permission to log onto the DSM appliance. Simply point a browser to your DSM appliance (https) to launch the UI and provide the login details. Note however that there is no need for a DSM Admin to do this database provisioning task. Another option is to create DSM User accounts and hand these off to end-users so they can provision databases themselves (self-service). Or another approach could be for the DSM Admin to create the database, ensuring that it is built correctly, and then assign ownership of the database to an end-user, providing them with the connection string to the database. There are multiple workflow options to explore.

Basic Information

Immediately after logging in, you will be placed in the dashboard view of the DSM UI. From there, you can either click on the ‘Create Database’ link in the Databases window or else navigate to the Databases menu item and then click on the ‘Create Database’ button. This will launch the create database wizard, and drop you into the first step, which is to give basic information about the database that you wish to create. First, you will need to select the Database Engine. Note that DSM 2.0 will support 3 databases to begin with. We will support PostgreSQL and MySQL, which are based on Tanzu SQL from VMware. The third database is AlloyDB Omni. We have partnered with Google to offer a tech preview of this database to our customers in DSM 2.0. One benefit of AlloyDB Omni is that it includes some specific extensions and features to make it work well as a vector database. Vector databases are a critical component in VMware’s Private AI stack. More on this in a future post. In this post, I am going to select Postgres as my database.

After selecting Postgres as the database engine, more information is required. This information includes a Database Version. DSM will support multiple versions of Postgres at GA. I have chosen major version 15. I also need to give an Instance Name for the database. The Instance Name will also be used as the Database Name, but you can change this if you wish. Possibly the most important decision is to select the correct Replica Mode for the database. If you just want a simple, standalone database with no other replicas, then you can choose Single Server. Whilst this does not offer any replication of the database, the database can still be protected by vSphere HA which can restart the database VM on another ESXi host in the vSphere Cluster in the event of an ESXi host failure. Note that if Replica Mode is set to Single Server, then the Topology is not configurable. Instead, the database ‘Primary’ replica and the special Postgres component called the ‘Monitor’ reside on the same VM/node. Finally, we create a special Admin Username called pgadmin for accessing the database. This admin username can also be modified.

For the purposes of demonstrating Topology, let’s assume that this is a production database which requires replicas. In that case, we would choose the Single vSphere Cluster as the Replica Mode. Now we get a choice on the Topology settings.

We can decide to have 1 more replica or 3 more replicas. Note that with Postgres, the Monitor component is also moved off to its own VM / node. Thus, if a 3 node topology is chosen, you get the Primary, 1 x Replica and a Monitor rolled out. With a 5 node topology, you get the Primary, 3 x Replica and a Monitor rolled out. That completes the Basic Information section. The next section relates to the Infrastructure Policy that we built in part 1.
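To make the node arithmetic concrete, here is a minimal Python sketch of the topologies described above. The function name and the mode strings are my own illustrative labels, not identifiers from the DSM UI or API.

```python
from typing import List


def node_roles(replica_mode: str, node_count: int = 1) -> List[str]:
    """Return the role hosted on each VM for a given deployment shape.

    The mode strings are hypothetical labels for the two Replica Mode
    choices discussed in this post.
    """
    if replica_mode == "single_server":
        # Primary and Monitor share the one VM; Topology is not configurable.
        return ["primary+monitor"]
    if replica_mode == "single_vsphere_cluster":
        if node_count not in (3, 5):
            raise ValueError("topology must be 3 or 5 nodes")
        # Every node that is not the Primary or the Monitor is a Replica.
        replicas = node_count - 2
        return ["primary"] + ["replica"] * replicas + ["monitor"]
    raise ValueError(f"unknown replica mode: {replica_mode}")


if __name__ == "__main__":
    for count in (3, 5):
        print(count, node_roles("single_vsphere_cluster", count))
```

In other words, the node count is always the number of replicas plus two, because the Primary and the Monitor each get their own VM.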

Infrastructure

The first step in the Infrastructure section is to choose an Infrastructure Policy created previously. The infrastructure policy has compute, storage, networking and VM sizing information. It allows a vSphere administrator to have control over which resources on their vSphere infrastructure can be consumed by the data services provisioned by DSM. If there are multiple infrastructure policies created by the vSphere administrator, then one of the policies must be chosen at this point.

If the Infrastructure Policy has multiple compute resources associated with it, e.g. multiple vSphere clusters, or multiple Resource Pools from different vSphere clusters, then you can either let DSM choose a compute resource automatically or select one manually. By default, automatic placement is selected (Select Placement Manually is disabled), as shown below.

If you decide to do manual placement, then enable the slider button to display compute options. In the example below, the infrastructure policy actually has two vSphere clusters associated with it. You can now choose either one of the clusters for the placement of the Postgres database. All components belonging to the database will be placed on the cluster you select. Note that if placement is set to automatic, DSM will choose one of the clusters and place all of the Postgres components on that one cluster – it will not distribute or move components across clusters in DSM 2.0. Multi-AZ deployments of data services across clusters, placing different replicas on different vSphere clusters, is something we are looking into for a future release of DSM.

You may have noticed that the Storage Policy is automatically populated. This is because the infrastructure policy only has a single storage policy associated with it. If there are multiple storage policies in the infrastructure policy, then a drop-down would be provided to select a storage policy for the deployment. But since there is only one, it is automatically populated.

However, multiple VM classes have been added to this Infrastructure Policy, so now we have a choice of which VM Class to pick. Simply choose one to decide on the size of the VMs that will run the Postgres database.

Backup and Maintenance

We now come to two of the most interesting aspects of Data Services Manager. All data services deployed by DSM, by default, have backups enabled. For Postgres, we take a full backup of the database once per week and then carry out daily incremental backups. DSM also takes regular backups of the WAL (Write Ahead Logs) meaning that we are essentially doing continuous backups. This also means that we can restore the Postgres database to any point in time (PIT). The only step required at this point is to choose a backup location, which we would have configured in part 2 when setting up the DSM appliance/provider. The retention period is 30 days by default, but this can also be changed.

If you do not wish to use the default backup schedule (Full weekly on Saturday at 23:59, Incremental daily at 23:59), you can disable the default backup schedule and add your own. You need to add at least one backup schedule, but you should probably add two; one for full backups and the other for incremental.
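As a rough illustration of the default schedule, the Python sketch below works out when the next full and incremental backups would fall. The function names are hypothetical, and the sketch assumes naive timestamps in the same time zone as the schedule; it does not reflect how DSM itself evaluates schedules.

```python
from datetime import datetime, timedelta

SATURDAY = 5  # datetime.weekday(): Monday is 0, so Saturday is 5


def next_daily(now: datetime, hour: int = 23, minute: int = 59) -> datetime:
    """Next occurrence of hour:minute strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def next_weekly(now: datetime, weekday: int = SATURDAY,
                hour: int = 23, minute: int = 59) -> datetime:
    """Next occurrence of hour:minute on `weekday` strictly after `now`."""
    candidate = next_daily(now, hour, minute)
    while candidate.weekday() != weekday:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    now = datetime(2024, 1, 3, 12, 0)  # a Wednesday
    print("next incremental:", next_daily(now))
    print("next full:       ", next_weekly(now))
```

If you replace the default schedule with your own, the same two helpers can be called with a different weekday, hour and minute to sanity-check when the backups will run.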

Maintenance relates to the life-cycle management of the databases. It defines whether databases are automatically updated when there is a new patch release (minor version) for a specific version of a database engine. With this feature enabled, DSM will download the new patch or image and do a rolling upgrade of your database during the defined maintenance window without any operator intervention. Of course, rolling upgrades could also be initiated manually if there is a critical patch released. However, this feature avoids having to upgrade each database manually, which could be tedious. The Maintenance Window is enabled by default, and the slot for maintenance defaults to Saturdays at 23:59. The window is open for 6 hours.
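Because the default window opens late on Saturday and stays open for 6 hours, it actually closes at 05:59 on Sunday. The Python sketch below checks whether a given timestamp falls inside such a window. The function name and parameters are hypothetical, and the timestamp is assumed to be in the same time zone as the window.

```python
from datetime import datetime, timedelta


def in_maintenance_window(ts: datetime, weekday: int = 5, hour: int = 23,
                          minute: int = 59, duration_hours: int = 6) -> bool:
    """True if `ts` lies in the weekly window (default: Saturday 23:59 + 6h)."""
    # Step back to the configured weekday at the window's start time.
    start = ts.replace(hour=hour, minute=minute, second=0, microsecond=0)
    start -= timedelta(days=(ts.weekday() - weekday) % 7)
    if start > ts:
        # Same weekday but before opening time: use last week's window.
        start -= timedelta(days=7)
    return start <= ts < start + timedelta(hours=duration_hours)


if __name__ == "__main__":
    print(in_maintenance_window(datetime(2024, 1, 7, 3, 0)))   # Sunday 03:00
    print(in_maintenance_window(datetime(2024, 1, 3, 12, 0)))  # Wednesday noon
```

The point to take away is that the window spans midnight, so a check that only compares the day of the week would miss the Sunday morning hours.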

Advanced Settings – Database Options

The final part of configuring the Postgres database is the advanced settings. In DSM v1.x, there were only a limited number of advanced settings that we allowed to be modified. In DSM 2.0, we allow any advanced setting to be applied to the database. In the simple example below, I am adding a single parameter called max_connections and setting it to 10. Multiple parameters can be added. This is something that many of our customers have been asking for, so it’s great to see this included in the new release.
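For readers more used to editing postgresql.conf by hand, the Python sketch below renders a set of parameters in that file's `name = value` syntax. This only illustrates what the parameters mean to Postgres; it is an assumption for illustration and says nothing about how DSM stores or applies them. The function name is hypothetical.

```python
def render_settings(settings: dict) -> str:
    """Render parameters as postgresql.conf style lines, sorted by name."""
    lines = []
    for name, value in sorted(settings.items()):
        if isinstance(value, bool):
            # Check bool before int, since bool is a subclass of int.
            rendered = "on" if value else "off"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        else:
            # Quote strings; a single quote is escaped by doubling it.
            rendered = "'" + str(value).replace("'", "''") + "'"
        lines.append(f"{name} = {rendered}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_settings({"max_connections": 10}))
```

Once the database is up, running `SHOW max_connections;` from any SQL client is a quick way to confirm that a value took effect.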

Summary

That completes the setup steps. The next screen provides a Summary of the configuration. If everything is as expected, you can click the ‘Create Database’ button to start the deployment of the Postgres database.

Database details post deployment

There are a number of tasks which now take place to provision the Postgres database. These details will be discussed in a future post, but for now, let’s assume that the database has been provisioned successfully. The Databases view gives us some detail about the status and the readiness of the database.

By clicking on the database Instance Name, more details about the database are revealed. Much of the basic information provided during the earlier provisioning steps is displayed, but the view also includes a ‘Copy Connection String’ option which allows Database Administrators (DBAs) or end-users to connect to the database. Other details include the Database Replication, the Infrastructure details and the Maintenance Window.
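Once you have copied a connection string, it can be handy to pull it apart before handing it to a client or a script. The Python sketch below assumes the string follows the standard PostgreSQL URI form (`postgres://user:password@host:port/dbname`); I have not shown the exact format DSM emits, so treat that as an assumption. The host, password and database name in the example are made up.

```python
from urllib.parse import unquote, urlsplit


def parse_pg_uri(uri: str) -> dict:
    """Split a PostgreSQL connection URI into its main components."""
    parts = urlsplit(uri)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError("not a PostgreSQL URI")
    return {
        "user": unquote(parts.username or ""),
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is the PostgreSQL default port
        "dbname": parts.path.lstrip("/") or None,
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    example = "postgres://pgadmin:secret@db.example.com:5432/mydb"
    print(parse_pg_uri(example))
```

The password is deliberately left out of the returned dictionary so that it does not end up in logs by accident.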

Some other tabs to note for the database are Monitoring and Backup. The Monitoring tab provides insights into the performance and behavior of the database itself. It includes both Alerts and Metrics. This Metrics view is different to the VM view in the vSphere Client. These metrics are coming directly from the database and are of interest to the DBA or end-user to check if the database is behaving as expected. This is another really neat feature of DSM versus home-grown DBaaS implementations. This detailed database information is at your finger-tips from deployment time.

The next tab to show you is the Backup view, which provides details around the backup schedules and the status of the regular WAL backups. Because we continuously capture the Write-Ahead Logs (WAL), we can restore the database to any point in time.

vSphere Administrator Insights

One final view to leave you with is from the vSphere Client. Remember that we are building this product so that vSphere administrators can have visibility into how data services are consuming vSphere resources. Thus, if a vSphere administrator selects an infrastructure policy, they can also see which databases have been provisioned against that policy, as shown here. It even provides a breakdown of the different VMs (and their roles) in providing the database.

Conclusion

That concludes the third blog post in this series. Hopefully you now have a good idea of how to deploy and configure Data Services Manager version 2.0, as well as provision a database from the DSM UI. Whilst we have talked about different personas for carrying out the different tasks, such as the vSphere Administrator creating the infrastructure policies and the Data Administrator (or DBA) creating the databases, I do feel that a vSphere Admin could also take on the role of the Data Administrator, creating, managing and monitoring the various data services on behalf of the end-user. This would be similar to how we enabled vSphere administrators to become storage administrators when we launched vSAN. We made it as simple as possible to create, manage and monitor vSAN via the vSphere Client. We have the same goal for Data Services Manager. Hopefully, some of the features shown here such as automated backups, automated lifecycle management, and visibility into data services via metrics demonstrate how we are trying to simplify data service management with this product. If you have any feedback or suggestions on what to add to DSM, we’d love to hear from you. Feel free to leave a comment either on the blog or via social media.