EMC XtremIO revisited – a chat with Vinay Gaonkar

On a recent trip to VMware in Palo Alto, I found some time to visit with a good pal of mine, Vinay Gaonkar, who is now the Product Manager for XtremIO over at EMC. Vinay used to be a storage PM at VMware (he worked on the initial phases of VVols), and we worked together on a number of storage items in various vSphere releases. It's been almost 2 years since I last spoke to the XtremIO folks (VMworld 2012 in fact, when the product still had not become generally available), so I thought that this would be a good time to catch up with them, as we are in the run up to VMworld 2014.

Quick Overview

The original overview which I did can be found here, so I'll try not to repeat what is in that post too much. Suffice to say that EMC XtremIO is an all flash array (AFA) which comes in an active/active, scale-out architecture built from building blocks called X-Bricks. Currently the cluster can be scaled out to 4-node configurations, although Vinay mentioned that they are actively looking to increase this by the end of the year. Connectivity between the X-Bricks is InfiniBand, and the system automatically re-balances when more X-Bricks are added.

Each X-Brick comes in a dual controller configuration with 25 x eMLC SSDs. There are currently three X-Brick models, offering 5TB, 10TB or 20TB of capacity depending on the number and size of SSDs used. You can expand the 5TB X-Brick non-disruptively to 10TB by adding additional SSDs.

With that in mind, let's look at some of the other details.

Garbage Collection – How is flash managed?

I had a very interesting conversation with Vinay about this. In Vinay's opinion, if an All Flash Array does system-level garbage collection, which is essentially defragmentation of the flash in the background, it can have a huge impact on performance when the array is trying to create contiguous space. It may not be such a big issue while only a small amount of the capacity is in use, but as the flash starts to fill up, the performance of your array can become a real concern as it tries to make space available for incoming writes, consuming precious controller resources in the process. Some arrays also reserve space for garbage collection, so although you purchase X amount of storage, you may not get that amount of usable capacity because of this overhead.
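To give a rough feel for why garbage collection gets more expensive as the flash fills up, here is a back-of-the-envelope illustration of my own (not something from Vinay, and not XtremIO specific): reclaiming an erase block that is mostly full of still-valid data means rewriting that valid data elsewhere first.

```python
# Rough illustration: the fuller an erase block is with still-valid data, the
# more pages must be rewritten to reclaim it, i.e. the higher the write
# amplification the garbage collector imposes on top of host writes.
def write_amplification(valid_fraction: float) -> float:
    # Total flash writes per page of host data, assuming the blocks being
    # reclaimed are valid_fraction full: 1 / (1 - valid_fraction).
    return 1.0 / (1.0 - valid_fraction)

for valid in (0.50, 0.80, 0.90, 0.95):
    print(f"blocks {valid:.0%} valid -> ~{write_amplification(valid):.1f}x write amplification")
```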

XtremIO does not use system-level garbage collection – instead it relies on the SSDs themselves to do it. The eMLC (enterprise Multi-Level Cell) SSDs used by XtremIO provide their own garbage collection. This means that garbage collection can be done on a per-drive basis, rather than across all drives at the same time, which is what some of their competitors do. This, along with the flash-optimized storage features that we will look at shortly, is a big differentiator for XtremIO over other AFA vendors, according to Vinay.

RAID & Failure Handling

I asked how XtremIO handles SSD failures. The RAID type is XDP, and with 25 drives the configuration is 23+2 dual parity (8% overhead), which can tolerate multiple SSD failures. Failed drives are rebuilt using the parity information. Vinay stated that XDP is optimized to reduce flash writes and rebuild times and to increase IO performance.
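For completeness, that 8% figure is simply the two parity drives expressed as a fraction of the 25 drives in the 23+2 layout:

```python
# Quick sanity check on the XDP overhead quoted above (23 data + 2 parity).
data_drives, parity_drives = 23, 2
total = data_drives + parity_drives
print(f"parity overhead: {parity_drives / total:.0%}")   # 8%
print(f"usable fraction: {data_drives / total:.0%}")     # 92%
```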

Vinay also mentioned that because they use eMLC rather than cMLC (consumer-grade MLC) drives, the endurance of the SSDs is considerably higher.

Deduplication

XtremIO uses Content Addressing (a SHA-1 hash of the block's content) for data layout and placement. Because of this, no two identical blocks are ever written to flash. Deduplication comes for free and is inline and always on – there is no post-processing or additional overhead.
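To illustrate the general idea of content addressing (a minimal sketch of the technique, not XtremIO's actual implementation):

```python
import hashlib

# A toy content-addressed block store: identical blocks hash to the same
# fingerprint, so the data is only stored once.
class BlockStore:
    def __init__(self):
        self.blocks = {}      # fingerprint -> block data
        self.refcount = {}    # fingerprint -> number of logical references

    def write(self, data: bytes) -> str:
        fingerprint = hashlib.sha1(data).hexdigest()
        if fingerprint not in self.blocks:       # only unique content hits flash
            self.blocks[fingerprint] = data
        self.refcount[fingerprint] = self.refcount.get(fingerprint, 0) + 1
        return fingerprint                       # logical address = content hash

store = BlockStore()
a = store.write(b"A" * 8192)
b = store.write(b"A" * 8192)          # identical block: deduplicated inline
print(a == b, len(store.blocks))      # True 1
```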

Compression

Support for compression was announced recently and my understanding is that it will be available shortly. Vinay understands that compression is a very important feature to have, especially for databases, where deduplication may offer little benefit (only when you back up or clone, perhaps).

Encryption

Encryption is already there – the eMLC SSDs which XtremIO uses today are Self-Encrypting Drives (SEDs). The SSD is simply provided with the key it needs to use. XtremIO has a built-in key management system, and there is no overhead for encryption.

Scale-Out

We mentioned that X-Bricks can be scaled out, and that an automatic re-balance will take place. The re-balance operation should take around a few hours (depending on the total amount of allocated data), but today the introduction of a new X-Brick is disruptive.
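To picture why a re-balance is needed at all, here is a deliberately simplified sketch of my own (an assumption about content-based placement in general, not XtremIO internals): if a block's fingerprint determines which brick owns it, adding a brick changes the owner of some fingerprints, and those blocks have to be moved in the background.

```python
import hashlib

# Toy placement: a block's owning brick is derived from its content fingerprint.
def owner(fingerprint: str, num_bricks: int) -> int:
    return int(fingerprint, 16) % num_bricks

fingerprints = [hashlib.sha1(bytes([i % 256, i // 256])).hexdigest() for i in range(1000)]

before = [owner(fp, 3) for fp in fingerprints]   # 3-brick cluster
after = [owner(fp, 4) for fp in fingerprints]    # a 4th brick joins

moved = sum(1 for b, a in zip(before, after) if b != a)
print(f"{moved / len(fingerprints):.0%} of blocks change owner and must be re-balanced")
```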

TP, VAAI, Snapshots

XtremIO offers Thin Provisioning capabilities, the full complement of VAAI block primitives and an extremely powerful snapshot mechanism. Version 2.4 of the product offers 8K snapshots in total per cluster, and this includes snapshot depth (a snapshot of a snapshot). When a snapshot is created, the existing metadata and data are shared between the production volume and the snapshot. Snapshots use a redirect-on-write technique which incurs no overhead from a capacity or performance perspective, which Vinay feels is another unique feature of their array. XtremIO snapshots are fully fledged (writeable).
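As a rough illustration of redirect-on-write (again, a sketch of the general technique under my own assumptions, not XtremIO's implementation): the snapshot simply references the same data blocks as the production volume, and a later write to the volume points the logical block at new data rather than overwriting the old.

```python
# Toy redirect-on-write: volume and snapshot reference data blocks by fingerprint.
class Volume:
    def __init__(self, block_map=None):
        self.block_map = dict(block_map or {})     # logical block -> fingerprint

    def snapshot(self):
        # Near-instant: only the small map is duplicated, the data is shared.
        return Volume(self.block_map)

    def write(self, lba, new_fingerprint):
        self.block_map[lba] = new_fingerprint      # redirect, never overwrite in place

vol = Volume({0: "fp-aaa", 1: "fp-bbb"})
snap = vol.snapshot()                  # shares fp-aaa and fp-bbb with the volume
vol.write(0, "fp-ccc")                 # production writes land on new blocks
print(snap.block_map[0], vol.block_map[0])   # fp-aaa fp-ccc
```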

Replication

There is no native replication yet, but there is a plan to implement it in the near future. VMware's vSphere Replication, which does not require array-based replication, can be used in the meantime. Another option is EMC VPLEX. Native replication will leverage both the existing dedupe capability and the planned compression capabilities to optimize the replication process.

VMworld 2014

EMC are a Global Diamond Sponsor at this year's VMworld 2014. I'm sure XtremIO will be showcased a lot at the show. If you are in the market for an All Flash Array with lots of data services and features, pop by the booth and speak to Vinay, who will be happy to fill you in on the details.