A closer look at SpringPath

Another hyper-converged storage company has just emerged from stealth. Last week I had the opportunity to catch up with the team from SpringPath (formerly StorVisor), based in Silicon Valley. The company has a bunch of ex-VMware folks on board, such as Mallik Mahalingam and Krishna Yadappanavar. Mallik and Krishna were both involved in a number of I/O related initiatives during their time at VMware. Let’s take a closer look at their new hyper-converged storage product.

Overview
The SpringPath data platform is a hyper-converged storage solution. It can be considered a hybrid storage solution as it uses a combination of flash devices for caching and magnetic disks for persistent storage to provide a datastore for virtual machine deployments (although they plan on other use-cases, more on this later). The long-term goal, from what I could gather, is that they want to be considered a storage platform for any environment, be it virtual machine workloads, containers or big data.

The hypervisor is ESXi from VMware currently (thus hyper-converged) but there are plans to support other hypervisors such as KVM too. There is a 3 node minimum requirement, and the cluster can be scaled out one node at a time. This can be done without impacting the current cluster, and an automatic rebuild of the storage objects backing the virtual machine takes place, utilizing the new storage to balance the cluster. One of their primary goals is to make deployment as simple as possible.

Architecture
The SpringPath data platform is a pure software solution; they maintain a hardware compatibility list (HCL) where customers can choose a range of servers and storage depending on their requirements. From what I can see, there are a handful of servers supported currently, but again they plan to grow over time.

As mentioned, the servers require both SSD for cache and magnetic disks for capacity. Interestingly but not surprisingly, SpringPath also recommend a 1:10 cache:capacity ratio, as most virtualized workloads have a working set somewhere in the region of 10%.
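To make the 1:10 ratio concrete, here is a small sizing sketch. The function name and the 8,000GB example node are my own illustration, not anything from SpringPath's documentation; the only input taken from the briefing is the 1:10 cache:capacity recommendation.

```python
def recommended_cache_gb(capacity_gb: float, capacity_per_cache: int = 10) -> float:
    """Flash cache size implied by a 1:N cache:capacity ratio (N=10 by default).

    Hypothetical helper for illustration only.
    """
    if capacity_gb < 0:
        raise ValueError("capacity_gb must be non-negative")
    if capacity_per_cache <= 0:
        raise ValueError("capacity_per_cache must be positive")
    return capacity_gb / capacity_per_cache


# Example: a node with 8,000GB of magnetic disk capacity (made-up figure).
node_capacity_gb = 8000
print(f"Suggested flash cache: {recommended_cache_gb(node_capacity_gb):.0f}GB")
```

If your working set is known to be larger or smaller than 10%, the second argument can be adjusted accordingly.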

The software is deployed in the form of a virtual appliance – the SpringPath controller. The virtual appliance is a bit on the large side at 8 vCPUs and 40GB memory, but once you hear about the data services that these guys are supporting, you’ll understand why.

The appliance takes control of the local storage. Under the covers, a distributed storage layer is created based on their patent-pending HALO (Hardware Agnostic Log-structured Object) architecture. Data can be accessed via File, Block, Object or API Plugins.

Once the appliance is deployed, it consumes the local flash and magnetic disks of the ESXi hosts, and presents one or more NFS datastores back to the ESXi hosts for provisioning of VMs (this architecture is reminiscent of Nutanix). Since the resulting datastore is NFS, they also have a VAAI-NAS plugin to accelerate certain storage related activities.

They actually have a very nice deployment UI, with lots of bells and whistles, which makes the deployment of SpringPath data platform extremely simple. Again, this is one of their design goals. Once the cluster is deployed, it can be managed from the vSphere web client since SpringPath have developed their own vCenter web client plugin.

[Image: SpringPath web client]

Interestingly, VMs are deployed across all the nodes in the cluster. SpringPath does not need affinity between application cache data and persistent data – in other words, caching could be on an SSD on one host, while persistence could be on the magnetic disks of another host. Data distribution is achieved by striping the data across all hosts in the cluster. Their platform makes sure that there is a uniform distribution of data across the entire cluster. In addition, the SpringPath data platform re-balances the data automatically when there is an imbalance in the cluster (due to nodes or disks going away), and since they support compression and dedupe, the amount of data that has to be moved between nodes is reduced. Let’s talk about the data services in more detail.
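As a rough mental model of striping, here is a minimal round-robin sketch. To be clear, this is my own simplification: I don't know which placement algorithm SpringPath actually uses, and the node names and the `place_stripes` function are hypothetical. It only shows how striping gives a uniform spread, and how that spread changes when a node is added.

```python
from collections import Counter


def place_stripes(num_stripes: int, nodes: list[str]) -> list[str]:
    """Assign stripe i to nodes[i % len(nodes)] (simple round-robin striping)."""
    if not nodes:
        raise ValueError("at least one node is required")
    return [nodes[i % len(nodes)] for i in range(num_stripes)]


# Hypothetical 3-node cluster (the minimum supported size) holding 12 stripes.
nodes = ["node-a", "node-b", "node-c"]
before = Counter(place_stripes(12, nodes))

# Scale out by one node and place the same 12 stripes again.
after = Counter(place_stripes(12, nodes + ["node-d"]))

print("3 nodes:", dict(before))
print("4 nodes:", dict(after))
```

A naive modulo scheme like this one relocates a lot of stripes when the node count changes, which is one reason real distributed stores tend to use smarter placement. The point here is only the even distribution.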

Data Services
As previously mentioned, the platform includes deduplication and compression. All data blocks that are written to the physical storage layer are compressed into objects. Each of these compressed objects is uniquely addressable using a key, with each key fingerprinted and stored with a checksum to provide data integrity.
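The general technique described here, content-addressed storage with compression and checksums, can be sketched in a few lines. This is not SpringPath's implementation; the class name, the choice of SHA-256 for the fingerprint, zlib for compression and CRC32 for the checksum are all assumptions I have made for illustration.

```python
import hashlib
import zlib


class DedupStore:
    """Toy content-addressed block store (illustrative assumption, not HALO)."""

    def __init__(self) -> None:
        # fingerprint -> (compressed bytes, CRC32 of the compressed bytes)
        self._objects: dict[str, tuple[bytes, int]] = {}

    def put(self, block: bytes) -> str:
        """Store a block and return its key; identical blocks are stored once."""
        key = hashlib.sha256(block).hexdigest()
        if key not in self._objects:
            compressed = zlib.compress(block)
            self._objects[key] = (compressed, zlib.crc32(compressed))
        return key

    def get(self, key: str) -> bytes:
        """Verify the checksum, then decompress and return the block."""
        compressed, checksum = self._objects[key]
        if zlib.crc32(compressed) != checksum:
            raise IOError(f"checksum mismatch for object {key}")
        return zlib.decompress(compressed)

    def __len__(self) -> int:
        return len(self._objects)


store = DedupStore()
block = b"x" * 4096
first_key = store.put(block)
second_key = store.put(block)  # duplicate write: no new object is created
print(first_key == second_key, len(store))
```

Because the key is derived from the block contents, two VMs writing the same block end up pointing at the same stored object, which is where the dedupe savings come from.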

The SpringPath data platform also supports native snapshots, which can be taken at the VM level or resource pool level as well as at a VM folder level (where the individual VM snapshots are all native snapshots). This is quite a neat feature. Their native snapshots are pointer-based, utilizing redirect-on-write technology, which means that the creation/deletion process is very fast. Because of this native format, there are no issues with snapshot consolidation. This functionality is available via their plugin to the vSphere web client.

Another feature is native clones, which once taken, can be modified and powered on as virtual machines in their own right. Here is a breakdown of the HALO architecture.

[Image: HALO architecture]

Support
The SpringPath data platform has a call-home support feature built in. Information taken from the host is crunched with Splunk-like tools, which allows them to offer dashboards to their customers via a cloud-based application called SpringPath Support Cloud. This is similar in some respects to Nimble InfoSight, I guess. There is 24x7x365 email support and 8x7x365 phone support. The SpringPath data platform is being offered under a subscription-based pricing model starting as low as $4K/server/year, and this includes licensing for all the features mentioned above as well as support.

Futures
I mentioned earlier that although this solution can be used for Virtual Machines, SpringPath have a bunch of other ideas for their platform going forward.

For the current platform, they are looking at stretch-clustering and disaster recovery solutions. They are not there yet, but they do hope to have asynchronous-mode DR within 6 months. They also mentioned that they are working towards a Virtual Volumes implementation.

Outside of vSphere support in the 1.0 product, they also have a beta of the SpringPath Data Platform for the following solutions:

  • OpenStack/KVM – provide storage for Cinder and Glance
  • OpenStack/Docker Containers – provide NFS storage for containers

Conclusion
This is already a competitive space. Regular readers will see some similarity between the SpringPath data platform and VMware’s Virtual SAN (VSAN). There are also similarities with Nutanix, SimpliVity, Maxta and Scale Computing, all players in the hyper-converged space. SpringPath will need additional differentiators to set them apart. Perhaps positioning themselves as a hyper-converged scale-out storage solution for OpenStack and Docker is one such way of doing that. I look forward to seeing what else they have planned.

Check out the press release here.