First things first – it seems to me that ViPR is all about dealing with those pesky storage issues that we all know about: explosive data growth, management complexity (including the problem of building out storage silos for specific applications), and finally the operations perspective – how can IT be given a portal to do basic storage tasks such as provisioning, extending and deleting storage without engaging the storage admin each time? Those are the storage issues I see ViPR trying to address.
Abstract & Pool
So how does ViPR achieve this, I hear you ask? Well, it follows the SDDC mantra of Abstract/Pool/Automate. ViPR can abstract away multiple different storage arrays from both EMC and third parties and present a single “virtual array” to your hosts. Once the physical storage arrays and associated fabrics (both FC and IP) have been discovered and the virtual array has been created, the next step for a storage administrator is to create storage pools. Each pool is assigned a storage type (block or file) as well as a provisioning type (thick or thin). The storage admin can also specify protocols, expandability options for the storage, snapshot options and remote protection (e.g. EMC RecoverPoint). A portal is then offered to the vSphere administrator or end-user to carry out certain storage-related tasks on the pools to which they have been granted access, relieving the storage admin of those tasks.
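To make the Abstract/Pool relationship concrete, here is a minimal sketch of the idea in Python. To be clear, these class and field names (`PhysicalArray`, `VirtualArray`, `VirtualPool` and so on) are my own illustrative inventions, not ViPR’s actual object model or API – the point is only to show how several physical arrays get abstracted behind one virtual array, which is then carved into typed pools.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative model only -- these names are not ViPR's real API.

@dataclass
class PhysicalArray:
    name: str
    vendor: str            # e.g. "EMC" or a third party

@dataclass
class VirtualPool:
    name: str
    storage_type: str      # "block" or "file"
    provisioning: str      # "thick" or "thin"
    protocols: List[str] = field(default_factory=list)   # e.g. ["FC", "iSCSI"]
    remote_protection: Optional[str] = None              # e.g. "RecoverPoint"

@dataclass
class VirtualArray:
    name: str
    arrays: List[PhysicalArray] = field(default_factory=list)  # abstracted arrays
    pools: List[VirtualPool] = field(default_factory=list)

    def add_pool(self, pool: VirtualPool) -> None:
        self.pools.append(pool)

# Two discovered physical arrays abstracted into one virtual array,
# with a thin block pool that has remote protection configured.
varray = VirtualArray("varray-1", arrays=[
    PhysicalArray("vmax-01", "EMC"),
    PhysicalArray("vnx-01", "EMC"),
])
varray.add_pool(VirtualPool("gold-block", "block", "thin",
                            protocols=["FC"],
                            remote_protection="RecoverPoint"))
```

The host only ever sees the virtual array and its pools; which physical array actually backs a given pool is the abstraction ViPR hides.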
Control Plane & Data Services
ViPR is considered to have both a Control Plane and a set of Data Services. The Control Plane talks to the management interfaces on the physical storage arrays and is responsible for provisioning, reporting, extending and snapshotting. No I/O flows through the control plane, and for block and file storage there is no dependency on it – loss of the control plane has no impact on access to the data. There is a caveat here when it comes to object storage – more about this shortly.
The other aspect of ViPR is the Data Services. Data Services deliver features that storage arrays do not currently have, for instance an object store: EMC ViPR data services can present object stores backed by physical arrays that do not support objects natively. As a caveat to the earlier point about ViPR not sitting in the I/O path for block and file, the same is not true for object storage – in this case there is a dependency on the control plane. However, the control plane is clustered for resilience. We’ll cover this later.
Another possible data service that ViPR could offer is VASA – vSphere APIs for Storage Awareness. If a physical array does not offer VASA, then the ViPR virtual array may be able to provide it on the array’s behalf. This would mean that Storage Profiles could be built based on the capabilities surfaced to vCenter by VASA.
ViPR gives a vSphere admin/application owner/end-user the ability to provision, snapshot, replicate and delete storage without having to engage a storage admin for the task. ViPR makes available to the end-user a ‘storage service catalog’ of tasks that the end-user can carry out on the storage. However, the storage administrator has full control over which tasks a vSphere admin/end-user is allowed to carry out on the storage.
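The catalog-plus-permissions idea can be sketched in a few lines. Again, this is a hypothetical illustration – the task names and roles below are my own, not ViPR’s actual catalog entries – but it shows the shape of the mechanism: a catalog of storage tasks, each gated by the roles the storage administrator has approved.

```python
# Hypothetical service catalog: each task maps to the set of roles
# allowed to run it. These names are illustrative, not ViPR's own.
CATALOG = {
    "provision_volume": {"vsphere_admin", "end_user"},
    "snapshot_volume":  {"vsphere_admin", "end_user"},
    "replicate_volume": {"vsphere_admin"},
    "delete_volume":    {"vsphere_admin"},
}

def services_for(role: str) -> list:
    """Return the catalog tasks a given role is permitted to run."""
    return sorted(task for task, roles in CATALOG.items() if role in roles)
```

So an end-user browsing the portal would only see (and only be able to launch) the subset of tasks the storage admin granted to that role.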
When the vSphere admin provisions storage via the portal, ViPR makes that storage directly visible (mapped and mounted) to the ESXi hosts in the vSphere cluster. You do have to set up physical host assets and provide the appropriate credentials in ViPR for this to work automatically – for instance, you need to supply the appropriate HBA WWNs in the case of FC.
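Since the map-and-mount automation depends on the host assets being registered correctly, here is a small sketch of the kind of sanity check you would want on those HBA WWNs. The `register_host` helper is hypothetical (not a ViPR API call); the WWN format itself is standard – an FC WWPN is 8 bytes, conventionally written as 16 hex digits in colon-separated pairs.

```python
import re

# An FC WWN/WWPN is 8 bytes, conventionally written as eight
# colon-separated hex pairs, e.g. 10:00:00:00:c9:2b:4c:5d.
WWN_RE = re.compile(r"^([0-9a-f]{2}:){7}[0-9a-f]{2}$", re.IGNORECASE)

def valid_wwn(wwn: str) -> bool:
    return bool(WWN_RE.match(wwn))

def register_host(hostname: str, wwns: list) -> dict:
    """Hypothetical helper: validate each HBA WWN before handing the
    host asset to the provisioning layer."""
    bad = [w for w in wwns if not valid_wwn(w)]
    if bad:
        raise ValueError("invalid WWN(s) for %s: %s" % (hostname, bad))
    return {"host": hostname, "initiators": wwns}
```

A typo in a WWN here means the automated zoning/masking simply won’t reach the host, so failing fast at registration time is worthwhile.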
EMC have integrated ViPR neatly with a number of VMware products. It has its own dashboard for vCenter Operations Manager as well as integration with vCenter Orchestrator, and that integration comes with quite a number of pre-canned orchestrator workflows specifically for ViPR automation purposes. However, it is not policy driven like VSAN (i.e. you cannot simply create a policy for a VM and ask ViPR to implement it). I’ll come back to this in the conclusion.
Control Plane Details
The control plane is composed of a cluster of either 3 or 5 nodes/VMs. ViPR ships only in OVA format, so you will need some vSphere infrastructure to deploy it, and at least 3 nodes are required. Each VM requires 16GB of memory, 4 vCPUs and 1TB of disk. The nodes use a Cassandra (NoSQL) database to store metadata about the infrastructure – not the actual data itself.
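The 3-or-5 node choice follows the usual majority-quorum arithmetic for clustered databases like Cassandra: the cluster stays writable as long as a majority of nodes survive. A quick sketch of the maths (my own illustration of why odd sizes are used, not anything ViPR-specific):

```python
def quorum(n: int) -> int:
    """Smallest majority of an n-node cluster."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """Nodes that can fail while a majority quorum still exists."""
    return n - quorum(n)
```

So a 3-node control plane tolerates one node failure and a 5-node cluster tolerates two – and a 4-node cluster would tolerate no more than a 3-node one, which is why even sizes are not offered.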
The Virtual SAN connection
Hopefully this post has given you an idea about what ViPR does and where it can be used. My understanding is that this is a product suited to medium to large-scale customers who have lots of different arrays and silos to manage, and who would benefit from a self-service portal for their end-users. What about Virtual SAN? How does it compete with or tie into ViPR? Well, as you have read, ViPR provides Software Defined Storage from an Abstract/Pool/Automate perspective but it does not store data itself – in other words it is a virtual storage array made up of multiple physical storage arrays at the back-end. ViPR does not have a data plane for storing actual data. Virtual SAN, by contrast, is a storage product which can be used for the deployment of highly available VMs through a distributed datastore made up of the local storage in each of the hosts in the VSAN cluster. You can get a lot more VSAN information from the posts listed here.
Another differentiator is the use of policies. VSAN allows the provisioning of virtual machine storage based on policy settings; there isn’t any such policy-driven storage feature in ViPR. However, once VMware launches the long-awaited Virtual Volumes product, I’m sure ViPR will be well positioned to leverage it.
It is conceivable that some time in the future, ViPR will tap into Virtual SAN APIs and be able to abstract that away for end-user consumption in much the same way as ViPR does for physical storage arrays right now.
By the way, ViPR 1.1 is due out before the end of the year. My understanding is that this will have support for more storage arrays from both EMC and 3rd parties. I also heard something about HDFS and Hadoop integration. That would be very interesting.