We are also expanding the number of data services available to our customers. In this release, we are announcing partnerships with Google and MinIO. VMware Data Services Manager will enable the seamless delivery of Google’s AlloyDB Omni databases and MinIO’s object stores on vSphere infrastructure. Google’s AlloyDB Omni includes vector-db extensions which make it an attractive database for VMware’s Private-AI stack. MinIO’s leading object store technology means that we have a mechanism to deliver S3 API-compatible object stores on vSphere infrastructure, something that many customers have been requesting.
The Infrastructure Policy is created by the vSphere Administrator in the vSphere Client. It defines compute, storage, network and virtual machine boundaries. When the Data Administrator (or DB Admin) selects an infrastructure policy during data service provisioning, the data service is provisioned only on the resources defined by that policy. Thus, a vSphere Administrator could build multiple infrastructure policies. Some could leverage all of the compute resources of a full vSphere cluster, whilst others could use subsets of resources through Resource Pools.

Similarly, you might choose different datastores or different storage features (via Storage Policies) for the different infrastructure policies. Some storage policies could map to vSAN datastores or vSAN features (e.g., RAID level), whilst others could map to VVol datastores. In fact, through tag-based storage policies, you could also select VMFS or NFS datastores as storage destinations.

The networking configuration of the infrastructure policy selects a distributed port group, so distributed switches are a requirement. Once again, different infrastructure policies could choose different networks for provisioning the data services: test and dev could use one network, verification and validation could use another, whilst production could be a completely different network. Associated with each network is an IP Pool containing a range of IP addresses, which the vSphere Administrator also configures.

Finally, you get control over the size of the VMs which make up the nodes in the K8s cluster. This prevents Data Administrators from building out monster VMs for their data services. You, as the vSphere Administrator, have full control over the sizes of deployed VMs through a concept called a VM Class. While we ship with some cookie-cutter sizes, you can absolutely create your own VM Classes to meet your needs.
Thus, you could have many infrastructure policies, each associated with a unique set of vSphere resources.
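To make the concept concrete, here is a rough sketch of the information an infrastructure policy captures. Note that this is purely illustrative: every field name and value below is hypothetical, not DSM's actual schema.

```yaml
# Hypothetical sketch of an infrastructure policy, for illustration only.
# All field names and values are invented; DSM's actual schema may differ.
name: prod-infra-policy
compute:
  cluster: vSphere-Cluster-01
  resourcePool: dsm-prod-rp          # optional: a subset of cluster resources
storage:
  storagePolicies:                   # vSphere Storage Policies offered to the Data Admin
    - vsan-raid1-policy              # e.g., maps to a vSAN datastore with RAID-1
network:
  portGroup: dvpg-prod-dsm           # distributed port group (distributed switch required)
  ipPool:                            # configured by the vSphere Administrator
    range: 192.168.50.10-192.168.50.50
    gateway: 192.168.50.1
    subnetMask: 255.255.255.0
vmClasses:                           # VM sizes the Data Admin may choose from
  - medium                           # e.g., 2 vCPU / 8 GiB
  - large                            # e.g., 4 vCPU / 16 GiB
```

The key idea is that everything the Data Admin later selects from is pre-scoped here by the vSphere Administrator.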
So how do we deploy a data service? At this point, the work of the vSphere Administrator is almost done. The last step is to create some users/permissions to access DSM, typically for a Data Administrator or DB Admin. The DB Admin can then log into the DSM portal and take care of all day 0 – day 2 operations around data services. In fact, the Data Admin/DB Admin could even create data services and hand ownership off to an end-user/developer. Alternatively, an end-user can be given access to the DSM portal to create their own data services. Both approaches are supported and both have valid use cases.

So how is a data service tied to an infrastructure policy created earlier? First, let's look at the basic information that is required when building a data service. Again, this workflow may change by the time we get to GA, but not by much. You can see in this example that we are going to create a PostgreSQL database, the version is 14, and the replica mode is Single vSphere Cluster. The topology is 1 primary, 1 replica and 1 monitor in the single vSphere cluster. This will provision a 3-node K8s cluster and place each of the components (primary, replica, monitor) as distinct pods on separate K8s nodes. This allows the database to remain available during maintenance and failures.
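Expressed declaratively, the choices above might look something like the following. Again, this is a hypothetical sketch: the kind and field names are invented for illustration and do not represent DSM's actual API.

```yaml
# Illustrative only -- not DSM's actual API. This captures the choices
# described above: PostgreSQL 14, single-cluster replica mode, and a
# 1 primary / 1 replica / 1 monitor topology.
kind: PostgresDataService          # hypothetical kind name
name: orders-db                    # assumed database name
engine: postgresql
version: "14"
replicaMode: SingleVsphereCluster
topology:
  primary: 1                       # each component runs as a distinct pod
  replicas: 1                      # on a separate K8s node, so the database
  monitors: 1                      # stays available during maintenance/failures
```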
Let’s now move to the next step where the infrastructure policy is chosen. Once the infra policy is selected, the Data Admin will be offered a drop-down list of storage policies associated with the infra policy. They will also be offered a drop-down list of VM classes which are also associated with this infra policy. These were all defined by the vSphere Administrator previously. The final configurable attribute is the size of the disks that are created for the data service. The disks are implemented as K8s Persistent Volumes (PVs) on the chosen datastore via the vSphere CSI driver, which is installed on the K8s cluster.
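Under the covers, this maps to standard Kubernetes storage constructs. As a sketch, the vSphere CSI driver (provisioner `csi.vsphere.vmware.com`) backs a StorageClass tied to a vSphere Storage Policy, and each disk is a PersistentVolumeClaim against it. The provisioner name is the documented vSphere CSI driver; the policy, class, and claim names here are assumptions for illustration.

```yaml
# Illustrative sketch of the K8s storage objects behind a DSM disk.
# The provisioner is the documented vSphere CSI driver; names are invented.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dsm-vsan-raid1
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vSAN RAID-1"   # the vSphere Storage Policy from the infra policy
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: dsm-vsan-raid1
  resources:
    requests:
      storage: 100Gi                 # disk size chosen by the Data Admin
```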
The next configuration items relate to backup and maintenance schedules. There are defaults for both, but these of course can be modified as required. Backups are automatically enabled, and take place daily. There is a full backup taken initially, with subsequent backups taken incrementally. The only configuration item is the selection of an object store backup destination. The backup retention policy has a default setting, but is configurable.
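The backup and maintenance settings described above could be sketched as the following configuration fragment. This is purely illustrative: the field names, bucket, and window values are assumptions, not DSM's actual schema; only the behavior (daily backups, initial full then incrementals, configurable retention, an object store target, and a maintenance window) comes from the text.

```yaml
# Hypothetical backup and maintenance configuration, for illustration only.
backup:
  enabled: true                 # backups are enabled automatically
  schedule: daily
  strategy:
    initial: full               # first backup is a full backup
    subsequent: incremental     # later backups are incremental
  target:
    type: s3                    # S3 API-compatible object store destination
    bucket: dsm-backups         # assumed bucket name
  retentionDays: 30             # default retention; configurable
maintenanceWindow:              # enables automatic patching (see below)
  enabled: true
  day: Saturday                 # assumed window
  startTime: "02:00"
  durationHours: 4
```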
DSM includes automated life-cycle management. If there is a new patch released for a particular data service, this will be highlighted in the DSM portal so as to bring it to the attention of the Data Admin. So long as a Maintenance Window was enabled when the data service was deployed, this patching will take place automatically without any interaction from the Data Admin. And if the data service has been provisioned in clustered mode (not standalone), the data service will be updated in a rolling fashion, avoiding any downtime. There are also various metrics and KPIs gathered from the data services so that the Data Admin can be alerted if anything is outside of the norm. We are also planning to add some troubleshooting suggestions to the alerts in the DSM portal to guide Data Admins on what might be the root cause of an issue.
The final configuration item is Advanced Settings. A big difference in DSM 2.0 is the ability to fine-tune any of the advanced settings associated with a data service. In DSM 1.x, we were quite restrictive in which advanced settings we allowed to be changed. In DSM 2.0, we are supporting any advanced setting changes that you wish to make. With all of this configured, you can now proceed with creating the data service.
Putting it all together, DSM 2.0 might be visualized as something like the following.
One final note – I mentioned already how K8s has a rich API and how this could be leveraged for automation purposes with DSM objects. The same is true with Aria Automation. We plan to have a number of custom resources in Aria Automation which we will be making available to our customers to use. You can then fine-tune these to provide database as a service or even application as a service to your own end users. This is the Aria Automation call-out in the previous diagram. We will provide more information on how to leverage these closer to GA.