Last year, NetApp announced a new host-side cache accelerator feature to complement their Virtual Storage Tiering (VST) technology. Rather than keeping all your data in flash, VST places hot data in flash while moving cold data to cheaper and slower media. NetApp are offering this as an end-to-end technology, from server to array controller (Flash Cache) to disk pools (Flash Pools). One of the major parts of this is Flash Accel, which was also announced in the latter part of last year, and is the server-side flash component of VST. On the back of their recently announced All Flash Array, NetApp are also making Flash Accel available to the general public.
I was fortunate enough yesterday to get an introduction to QLogic’s new Mt. Rainier technology. Although Mt. Rainier allows for different configurations of SSD/Flash to be used, the one that caught my eye was the QLogic QLE10000 Series SSD HBAs. These have not started to ship yet, but considering that the announcement was last September, one suspects that GA is not far off. As the name suggests, this is a PCIe flash card, but QLogic have one added advantage – the flash is combined with the Host Bus Adapter, meaning that you get your storage connectivity and cache accelerator on a single PCIe card. This is a considerable advantage over many of the other PCIe cache accelerators on the market at the moment, since those still require an HBA for SAN connectivity as well as a separate slot for the accelerator.
Among the new features of vSphere 5.1 were SSD monitoring and I/O Device Management (IODM), which I discussed in this post. I was doing some further testing on this recently and noticed that a number of fields from my SSD were reported as N/A when I ran the SMART statistics command against a local SSD drive on my host.
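A representative invocation and output are shown below. The device identifier and the specific values are illustrative only; the point is the number of parameters that come back as N/A, which will vary by drive and controller.

esxcli storage core device smart get -d t10.ATA_____local_SSD_illustrative_id
Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       N/A    N/A        N/A
Write Error Count             N/A    N/A        N/A
Read Error Count              N/A    N/A        N/A
Power-on Hours                100    0          100
Power Cycle Count             100    0          100
Reallocated Sector Count      100    36         100
Raw Read Error Rate           N/A    N/A        N/A
Drive Temperature             N/A    N/A        N/A
Driver Rated Max Temperature  N/A    N/A        N/A
Write Sectors TOT Count       N/A    N/A        N/A
Read Sectors TOT Count        N/A    N/A        N/A
Initial Bad Block Count       100    99         100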
There is no doubt that Flash is hot right now. Over the past few months, we have seen IBM acquire Texas Memory Systems (TMS), HDS unveil their own flash strategy and HP launch their all-flash 3PAR P10000 array. Of course, regular readers of my blog will have seen my posts about newer all-flash array vendors such as Pure Storage, Violin Memory & Nimbus Data. The purpose of this post is to highlight the flash storage solution from XtremIO, a company which was recently acquired by EMC.
I should point out that there is no XtremIO product available for purchase just yet. My understanding is that EMC hope to go GA with it sometime next year (but please check with EMC directly). Don’t go looking for XtremIO on the VMware HCL – you won’t find it.
At VMworld 2012 in Barcelona, I had the pleasure of meeting & chatting with Josh Goldstein of XtremIO. Josh was kind enough to give me a preview of XtremIO’s technology. He told me that XtremIO provide an all-flash storage array based on scale-out building blocks called “X-bricks” which cluster together to meet customer demands. The interconnect between X-bricks is InfiniBand. The LUNs are presented to ESXi hosts over iSCSI or Fibre Channel.
The “X-brick” comes in a configuration of two stand-alone controllers with a JBOD of 2.5″ form factor MLC flash SSDs. It has automatic load balancing across X-bricks, which Josh said provides consistently low latency as well as linear performance scalability as additional X-bricks are added. XtremIO’s core engine, implemented 100% in software, does inline deduplication of the data (4KB granular) and works across all of the volumes on all of the X-bricks in a cluster. XtremIO claim that they can achieve between a 5:1 and a 30:1 data reduction in virtualized environments. The added benefit of this inline, global deduplication is that it reduces the number of flash writes, reducing wear on the SSDs.
All volumes on the X-brick are protected with a flash-optimized configuration similar to RAID-6, but XtremIO have patented algorithms to minimize reads and writes, extend flash longevity and improve performance. It should be noted that this is done automatically and does not require configuration by the administrator. All volumes are also thinly provisioned.
Another key point is that the X-brick has no single point of failure. The cluster’s system manager function handles failures, and can restart various software components or fail them over.
Snapshots and clones are done at the LUN level, and work at a 4KB grain size. XtremIO claim that their snapshots and clones have no impact on performance, and perform just as well as the source LUNs. However, considering EMC’s participation in the Virtual Volumes (VVOLs) program, one suspects that granularity will move to the virtual machine or VMDK level at some point in the future.
There is no in-built replication mechanism on the X-brick at this time, but VMs running on the XtremIO X-brick can of course be replicated using VMware’s own vSphere Replication product. Although Josh could not go into specifics or roadmap details, he did state that a native replication feature is a high priority item for them.
Management plugin to vCenter
Management is currently done via a web-based UI and a Command Line Interface. A single admin UI screen allows you to monitor capacity, performance, alerts and hardware status. Integration with the vSphere UI is something XtremIO are currently looking at. As you can see, the UI is very simple, and at a single glance you can get an overall view of the health and performance of the X-brick cluster (unfortunately, the storage was idle at the time that this screen shot was captured, but hopefully you get the idea).
I asked Josh which features of the XtremIO X-brick he believes make it stand out from the other flash array vendors currently on the market. These were the items he highlighted as differentiators.
1. XtremIO’s dedupe is truly real-time & inline at all times. It is neither semi-inline (sometimes switched off when the array gets busy, with deduplication deferred to post-processing) nor a post-processing design. This has the benefit of reducing the number of writes seen by the SSDs, which both increases flash endurance and delivers better performance, since I/O cycles on the SSDs remain available for writing unique data.
2. XtremIO’s VAAI XCOPY is greatly enhanced by having real-time inline deduplication and by the fact that the XtremIO array always maintains its metadata tables in memory, rather than having to perform lookups on disk. Imagine what happens when an administrator clones a VM. VAAI tells the XtremIO array to copy (using the XCOPY command) the range of blocks corresponding to that VM from one location in the array to a new location. Since the VM already exists on the array (and thus no new unique blocks are being written), all the XtremIO system has to do is update its metadata tables to record that a new reference to those blocks exists. With all the metadata in RAM, the operation can be completed practically instantaneously. This gives administrators tremendous power and flexibility to roll out VMs on-demand without incurring a high I/O penalty on the storage array. XtremIO have a video demonstration of this capability on their website here. (See the sketch after this list for how to check the XCOPY offload from the ESXi host side.)
3. From a vSphere perspective you can get the full benefits of eager zeroed thick provisioning all the time, since the volumes are always thin provisioned at the back-end. And since the XtremIO array supports the VAAI zero blocks/write-same primitive and has special internal handling of zero blocks, there is no drawback in provisioning or formatting time for eager zeroed thick disks. (This is also covered in the sketch after this list.)
4. With its deduplication technology, XtremIO also make Full Clones attractive and cost-effective to run in all-flash, especially for VDI where storage can sometimes be cost prohibitive. This benefit is not exclusive to Full Clones – the XtremIO array works equally well with Linked Clones or any combination of Full and Linked Clones. An interesting report published jointly by VMware and XtremIO about VDI testing results can be found here. There is also an interesting VDI demonstration video here.
5. When you take all of these things together (advantages of inline dedupe, VAAI, eager zeroed thick all the time, no RAID configuration on the array), Josh stated that one could become a “lazy administrator”. All of the labour-intensive operations that required an administrator’s full attention are now taken care of by the array. And since everything is guaranteed to run at the same performance level (LUNs, snapshots, clones), there is no performance management necessary at the array level. The configuration steps are very simple – you create the volumes, create initiator groups and map volumes to the initiators. Very straightforward.
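To make items 2 and 3 above a little more concrete from the host side, the following is a minimal sketch from the ESXi shell showing how one might confirm that the two data mover offloads in question (XCOPY for copies, Write Same for eager zeroed thick disks) are enabled, and then exercise them. The datastore paths, VMDK names and sizes are illustrative only.

# Confirm hardware-accelerated move (XCOPY) and init (Write Same) offloads are enabled (Int Value of 1)
esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove
esxcli system settings advanced list -o /DataMover/HardwareAcceleratedInit

# Clone a VMDK residing on the array; with XCOPY offload the host issues copy requests
# to the array rather than reading and writing every block itself
vmkfstools -i /vmfs/volumes/xtremio-ds/template/template.vmdk /vmfs/volumes/xtremio-ds/newvm/newvm.vmdk

# Create a 40GB eager zeroed thick VMDK; the zeroing is offloaded to the array via WRITE SAME
vmkfstools -c 40G -d eagerzeroedthick /vmfs/volumes/xtremio-ds/testvm/testvm.vmdk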
The X-brick also comes with a complete set of CLI commands for management and monitoring, as XtremIO realise that a lot of administrators like to work at the command line for scripting and automation. This is nice to see.
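To give a feel for the workflow Josh described, the provisioning sequence boils down to three steps. The commands below are purely illustrative pseudo-commands and not the actual XtremIO CLI syntax, which I have not verified; they simply mirror the create volume, create initiator group and map steps.

# Illustrative pseudo-commands only (not verified XtremIO CLI syntax)
# 1. Create a thin provisioned volume
create-volume --name vdi-vol-01 --size 2T
# 2. Create an initiator group containing the ESXi hosts' FC/iSCSI initiators
create-initiator-group --name esx-cluster-01 --initiators <initiator-wwpn-or-iqn>
# 3. Map the volume to the initiator group
map-volume --volume vdi-vol-01 --initiator-group esx-cluster-01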
My colleague Andre does a really nice in-depth review of the X-brick in a VMware View POC here.
This flash storage solution certainly has a lot of neat features. With the number of established storage vendors now embracing flash, and the number of new flash-centric vendors arriving on the scene, 2013 should be an interesting year in storage.
Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage
At VMworld 2012 in San Francisco, I had the pleasure of catching up with Scott Kline, Karthik Pinnamaneni & the rest of the team from Nimbus Data. In the weeks leading up to VMworld I read quite a bit about Nimbus Data’s new Gemini Flash Array, but my primary interest was to figure out what integration points existed with vSphere.
Let’s start with a look at the Gemini Flash Array. The first thing that jumps out is that multiple protocols are supported for both SAN & NAS. The array supports the Fibre Channel, iSCSI, NFS, SMB and InfiniBand protocols. There is no FCoE support at this time, and when I asked the guys why, they said that this is simply due to lack of demand. There is nothing that would prevent them from implementing FCoE if there was sufficient demand for it, which they are not seeing right now.
An interesting fact is that Nimbus Data manufacture their own proprietary solid state drives. They purchase the NAND and build the drives themselves. There is a reason for this. One point that Scott and Karthik made to me was that many scale-out storage offerings do not scale out their cache with their arrays, and this then becomes the bottleneck. Nimbus Data address this by placing cache on each of their drives, so that as the storage scales out, so does the cache. They refer to this as their Distributed Cache Architecture (DCA).
The ‘secret-sauce’ at the heart of the Nimbus array is the HALO operating system. It provides administration, data protection, optimization, security, and monitoring of Nimbus Data arrays. The Nimbus Data array presents a single SSD device back to the ESXi host(s), either via a block protocol or NFS. Nimbus Data claim that their newer Gemini model can achieve 1.2 million IOPS in a 2U box, at a latency of only 100 microseconds. Yes, that is 0.1 millisecond latency. The I/O block size used to achieve this figure was 4K, with 80% read & 20% write. They were also able to sustain 12GB/s of throughput with a 256K block size.
One of the concerns many people have with flash is the lifespan. Nimbus Data are offering 10 year endurance with their drives. There are a number of things they do to mitigate the wear-out of their drives. One thing they do is cache the writes in DRAM. Once there is a full 64KB of writes in the cache, they do a full page write to flash. Nimbus Data also have an algorithm which chooses between the individual flash cells. Each of the cells is rated, and the algorithm will choose cells with a higher rating over cells with a lower rating. All of these contribute to the MLC (Multi-Level Cell) flash drives lasting the guaranteed 10 years. In fact, Scott told me that 2 years ago they deployed Nimbus Data flash arrays at eBay and the flash drives in these arrays have not yet reached 10% usage.
Nimbus Data currently support all three VAAI block primitives – ATS (Atomic Test & Set), Write Same (Zero) and XCOPY (Clone). They are working on the VAAI-NAS primitives but these are not available yet. The driving factor here of course is VCAI (View Composer Array Integration) offload – the ability to offload linked clone creation to the storage array for View desktop deployments.
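To confirm from the ESXi host which block primitives a given Nimbus LUN is advertising, the VAAI status of the device can be queried. The output below is a sketch only; the device identifier is illustrative, and I am assuming Delete (UNMAP) would show as unsupported since only the three block primitives above are claimed.

esxcli storage core device vaai status get -d naa.xxxxxxxxxxxxxxxx
naa.xxxxxxxxxxxxxxxx
   VAAI Plugin Name:
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: unsupported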
Scott also told me that they are working on a management plugin for the new vSphere 5.1 web-based client, but it wasn’t available for VMworld 2012. Right now, management is done by an external web-based management tool. However, I am led to believe that Nimbus Data will have a vCenter plugin for their management tool sometime in Q4 2012.
Business Continuance/Disaster Recovery
The Gemini array is designed to be fault tolerant, and replication can be configured in either synchronous or asynchronous mode. Snapshots and replication currently work at the volume level. There is no integration with VMware Site Recovery Manager at this time, but this is something Nimbus Data are hoping to have in place in the first half of 2013.
Overall, this is an amazing piece of technology. I would like to see even more integration with vSphere products and features going forward, as I personally think that this is a major differentiation factor in the storage market. Still, over 1 million IOPS in a 2U box – impressive stuff.
Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage