Tech Preview of EMC’s XtremIO Flash Storage Solution
There is no doubt that flash is hot right now. Over the past few months, we have seen IBM acquire Texas Memory Systems (TMS), HDS unveil their own flash strategy and HP launch their all-flash 3PAR P10000 array. Of course, regular readers of my blog will have seen my posts about newer all-flash array vendors such as Pure Storage, Violin Memory & Nimbus Data. The purpose of this post is to highlight the flash storage solution from XtremIO, which was recently acquired by EMC.
I should point out that there is no XtremIO product available for purchase just yet. My understanding is that EMC hope to go GA with it sometime next year (but please check with EMC directly). Don’t go looking for XtremIO on the VMware HCL – you won’t find it.
At VMworld 2012 in Barcelona, I had the pleasure of meeting & chatting with Josh Goldstein of XtremIO. Josh was kind enough to give me a preview of XtremIO’s technology. He told me that XtremIO provide an all-flash storage array based on scale-out building blocks called “X-bricks”, which cluster together to meet customer demand. The interconnect between X-bricks is InfiniBand. LUNs are presented to ESXi hosts over iSCSI or Fibre Channel.
The “X-brick” comes in a configuration of two stand-alone controllers with a JBOD of 2.5″ form factor MLC flash SSDs. It has automatic load balancing across X-bricks, which Josh said provides consistently low latency as well as linear performance scalability as you add additional X-bricks. XtremIO’s core engine, implemented 100% in software, does inline deduplication of the data (4KB granular) and works across all of the volumes on all of the X-bricks in a cluster. XtremIO claim that they can achieve between a 5:1 and a 30:1 data reduction in virtualized environments. The added benefit of this inline, global deduplication is that it reduces the number of flash writes, reducing wear on the SSDs.
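To make that write path a little more concrete, here is a minimal sketch of how content-addressed, inline deduplication at a 4KB grain can work. It is purely illustrative and assumes a simple in-memory fingerprint table; it is not XtremIO's actual implementation, whose internals have not been disclosed.

```python
import hashlib

BLOCK_SIZE = 4096  # 4KB granularity, as described above


class DedupStore:
    """Toy model of an inline, global deduplication engine.

    Incoming 4KB blocks are fingerprinted; if the fingerprint is already
    known, only a reference is recorded and nothing new is written to flash.
    """

    def __init__(self):
        self.fingerprints = {}   # fingerprint -> stored block (stand-in for a physical location)
        self.refcounts = {}      # fingerprint -> number of logical references
        self.volumes = {}        # (volume, lba) -> fingerprint

    def write(self, volume, lba, block):
        assert len(block) == BLOCK_SIZE
        fp = hashlib.sha256(block).hexdigest()
        if fp not in self.fingerprints:
            # Unique data: the only case that costs a flash write.
            self.fingerprints[fp] = block
            self.refcounts[fp] = 0
        self.refcounts[fp] += 1
        self.volumes[(volume, lba)] = fp

    def read(self, volume, lba):
        return self.fingerprints[self.volumes[(volume, lba)]]
```

In a model like this, rolling out a hundred near-identical VMs writes the unique 4KB blocks to flash once and turns everything else into reference counting, which is the intuition behind the 5:1 to 30:1 reduction claim.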
All volumes on the X-brick are protected with a flash-optimized configuration similar to RAID-6, but XtremIO has patented algorithms that minimize reads and writes, extend flash longevity and improve performance. It should be noted that this is done automatically, and does not require configuration by the administrator. All volumes are also thinly provisioned.
Another key point is that the X-brick has no single point of failure. The cluster’s system manager function handles failures, and can restart various software components or fail them over.
Snapshots/Clones
Snapshots and clones are done at the LUN level, and work at a 4KB grain size. XtremIO claim that their snapshots and clones have no impact on performance, and perform just as well as regular LUNs. However, considering EMC’s participation in the Virtual Volumes (VVOLs) program, one suspects that granularity will move to the virtual machine or VMDK level at some point in the future.
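XtremIO did not detail the snapshot mechanism, but a 4KB-grain, metadata-driven design can be sketched by extending the toy DedupStore model above: a snapshot simply duplicates the logical-to-fingerprint mappings, so no data blocks are copied. This is an illustrative assumption, not their published design.

```python
def snapshot(store, src_volume, snap_volume):
    """Metadata-only snapshot sketch: share every 4KB block of src_volume.

    Only mapping entries and reference counts change; the data itself is
    never read or rewritten, which is why such a snapshot can perform like
    an ordinary LUN.
    """
    for (vol, lba), fp in list(store.volumes.items()):
        if vol == src_volume:
            store.volumes[(snap_volume, lba)] = fp
            store.refcounts[fp] += 1
```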
Replication/SRM
There is no in-built replication mechanism on the X-brick at this time, but VMs running on the XtremIO X-brick can of course be replicated using VMware’s own vSphere Replication product. Although Josh could not go into specifics or roadmap details, he did state that native replication is a high priority item for them.
Management plugin to vCenter
Management is currently done via a web-based UI and a command line interface. A single admin UI screen allows you to monitor capacity, performance, alerts and hardware status. Integration with the vSphere UI is something XtremIO are currently looking at. As you can see, the UI is very simple, and at a single glance you can get an overall view of the health and performance of the X-brick cluster (unfortunately, the storage was idle at the time this screenshot was captured, but hopefully you get the idea).
I asked Josh which features of the XtremIO X-brick he believes make it stand out from the other flash array vendors currently on the market. These are the items he highlighted as differentiators.
1. XtremIO’s dedupe is truly real-time & inline at all times. It is neither semi-inline (sometimes switched off when the array gets busy, with deduplication deferred to post-processing) nor a post-processing design. This has the benefit of reducing the number of writes seen by the SSDs, which both increases flash endurance and delivers better performance, since I/O cycles on the SSDs remain available for writing unique data.
2. XtremIO’s VAAI XCOPY is greatly enhanced by having real-time inline deduplication and by the fact that the XtremIO array always maintains its metadata tables in memory, rather than having to perform lookups on disk. Imagine what happens when an administrator clones a VM. VAAI tells the XtremIO array to copy (using the XCOPY command) the range of blocks corresponding to that VM from one location in the array to a new location. Since the VM already exists on the array (and thus no new unique blocks are being written), all the XtremIO system has to do is update its metadata tables that a new reference to those blocks exists. With all the metadata in RAM, the operation can be completed practically instantaneously. This gives administrators tremendous power and flexibility to roll out VMs on-demand without incurring a high I/O penalty on the storage array. XtremIO have a video demonstration of this capability on their website here.
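Continuing the toy DedupStore model from earlier, the following sketch shows why an XCOPY against a dedupe-aware array with in-memory metadata reduces to bookkeeping. It is an illustration of the idea, not XtremIO's implementation.

```python
class XCopyStore(DedupStore):
    """Toy extension of DedupStore: XCOPY as a metadata-only operation."""

    def xcopy(self, src_volume, src_lba, dst_volume, dst_lba, num_blocks):
        # Nothing is read from or written to flash: the copied range already
        # exists, so we only add new references to existing fingerprints.
        # With the metadata tables held in RAM, this completes almost instantly.
        for i in range(num_blocks):
            fp = self.volumes[(src_volume, src_lba + i)]
            self.refcounts[fp] += 1
            self.volumes[(dst_volume, dst_lba + i)] = fp
```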
3. From a vSphere perspective you can get the full benefits of eager zeroed thick provisioning all the time, since the volumes are always thinly provisioned at the back-end. And since the XtremIO array supports the VAAI zero blocks/write-same primitive and has special internal handling of zero blocks, there is no drawback in provisioning or formatting time for eager zeroed thick disks.
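As a rough illustration of that point (an assumption based on the behaviour described above, since the internal zero handling is not public): under inline dedup, every all-zero 4KB block written by an eager zeroed thick format collapses onto a single fingerprint, so the format largely touches metadata rather than flash.

```python
ZERO_BLOCK = bytes(4096)  # an all-zero 4KB block


def zero_range(store, volume, start_lba, num_blocks):
    # Eager zeroed thick formatting writes the same all-zero block over and
    # over. Under the toy dedup model above, they all share one fingerprint,
    # so the format costs mapping updates rather than flash writes (a real
    # array may special-case zeros even more aggressively).
    for i in range(num_blocks):
        store.write(volume, start_lba + i, ZERO_BLOCK)
```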
4. With its deduplication technology, XtremIO also make Full Clones attractive and cost-effective to run in all-flash, especially for VDI, where storage can sometimes be cost-prohibitive. This benefit is not exclusive to Full Clones – the XtremIO array works equally well with Linked Clones or any combination of Full and Linked Clones. An interesting report on VDI testing results, published jointly by VMware and XtremIO, can be found here. There is also an interesting VDI demonstration video here.
5. When you take all of these things together (advantages of inline dedupe, VAAI, eager zeroed thick all the time, no RAID configuration on the array), Josh stated that one could become a “lazy administrator”. All of the labour-intensive operations that required an administrator’s full attention are now taken care of by the array. And since everything is guaranteed to run at the same performance level (LUNs, snapshots, clones), there is no performance management necessary at the array level. The configuration steps are very simple – you create the volumes, create initiator groups and map volumes to the initiators. That’s it.
The X-brick also comes with a complete set of CLI commands for management and monitoring, as XtremIO realise that a lot of administrators like to work at the command line for scripting and automation. This is nice to see.
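To show how short that three-step workflow is once scripted, here is a purely hypothetical sketch. The ArrayClient class, its method names and its parameters are invented for illustration only and do not correspond to XtremIO's actual CLI or API, which had not been published at the time of writing.

```python
class ArrayClient:
    """Hypothetical client used only to illustrate the workflow described
    above; it is NOT XtremIO's real CLI or API."""

    def create_volume(self, name, size_gb): ...
    def create_initiator_group(self, name, initiators): ...
    def map_volume(self, volume_name, initiator_group): ...


def provision_datastore(client, name, size_gb, host_initiators):
    client.create_volume(name, size_gb)                            # 1. create the volume
    client.create_initiator_group(name + "-ig", host_initiators)   # 2. create the initiator group
    client.map_volume(name, name + "-ig")                          # 3. map the volume to the initiators
```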
There is a really nice in-depth review of the X-brick in a VMware View POC here.
This flash storage solution certainly has a lot of neat features. What with the number of storage vendors who are now embracing flash, and the number of new flash-centric storage vendors on the scene, 2013 should be an interesting year in storage.
Get notification of these blogs postings and more VMware Storage information by following me on Twitter: @CormacJHogan
It sounds a lot like WHIPTAIL’s arrays, which have been shipping in volume for 4 years.
Which reminds me that I must catch up with you and Darren sometime soon Mike 🙂
I’m curious to understand how they solved the split-brain scenario that can happen on multi-node storage. In other solutions I’ve seen a separate and dedicated witness, like the FOM on LeftHand. None of the new scale-out storage systems coming out (SolidFire, for example) talk about this issue, or they say they have no SPOF or that they run as a pure peer-to-peer system, and I’m not sure they can without a witness.
Or simply they have another design working around the witness issue…
Luca, Nutanix’s scale-out architecture uses a Paxos implementation to choose a leader between 3 (or 5 or 7) cluster manager instances. Network partitioning is avoided with a quorum. I’m guessing they’ve something similar.
Thanks for the clarification. Do your numbers mean we always need an odd number of nodes on systems like yours?
Odd number of nodes to run the cluster manager ensemble (service). The total # of nodes in the cluster can be a superset, and even.
BTW, there are no special nodes to run the cluster manager ensemble. The service overlays on the same N controllers serving NFS and iSCSI.
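For readers following along, the quorum arithmetic being described here is a general property of Paxos-style ensembles rather than anything vendor-specific; a tiny illustration:

```python
def has_quorum(reachable_managers, ensemble_size):
    # A partition may keep serving only if it sees a strict majority of the
    # cluster-manager ensemble; the minority side must stop, so two sides of
    # a network split can never both accept writes (no split brain).
    return reachable_managers > ensemble_size // 2


# A 5-member ensemble split 3/2: only the 3-node side retains quorum.
assert has_quorum(3, 5) and not has_quorum(2, 5)
```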
Luca,
The XtremIO system uses nodes that are clustered together using a redundant InfiniBand fabric. Every node is physically cabled to two separate InfiniBand switches. The InfiniBand fabric is part of the system and is managed by XtremIO like all the other components in the system (controllers, SSD shelves, SSDs) – think of it as the system’s backplane.
Split-brain issues are not a concern because (1) we are not a loosely coupled system operating over a network fabric that is outside our control and that could become congested or fail in ways we cannot predict, and (2) our system’s nodes do not operate independently, but in an aggregated, coordinated fashion. There is no opportunity for a node or multiple nodes to become disconnected from the cluster and continue serving data that could be stale.
As an enterprise storage vendor, we obviously go to great efforts to design a system with no single points of failure. The XtremIO system continues to serve and protect data reliably even in the face of many types of rare, multiple simultaneous faults.
Guys, thanks so much to all for the further explanations. I’ve always been really fascinated by scale-out systems, and these sound like really promising products.
What is the difference between XtremIO and Whiptail? It sounds like a Whiptail array based on the description.
Yes, there are similarities. Both are all-flash arrays and both have nice vSphere integration features. However, I don’t believe EMC are shipping the XtremIO product just yet (one suspects it will be available sometime in 2013) whereas WhipTail already are. I would guess that it will now just boil down to how much performance and capacity you can get for your buck. I always look at the vSphere integration points (VAAI, management integration, DR via SRM, etc.) when looking at these products. If I was a VMware customer, these are the things I would be looking at.
Finally a flash “sweet spot”! Regarding differences between Whiptail and XtremIO, consider write wear and usable capacity. Everyone is writing software and firmware to reduce the impact of write wear, so all the good vendors do this pretty well. One major difference is that XtremIO adds deduplication, giving customers two things: less write wear and more usable capacity. I envision XtremIO and Isilon dominating many datacenters of the future, but regardless EMC certainly has an answer for everyone’s storage challenges. Looking forward to GA!