As many of you are aware, VMware made a number of announcements at VMworld 2012. There were three technical previews in the storage space. The first of these was on Virtual Volumes (vVOLS), which is aimed at making storage objects in virtual infrastructures more granular. The second was Distributed Storage, a new distributed datastore using local ESXi storage. The final one was Virtual Flash (vFlash). However, rather than diving into vFlash, I thought it might be more useful to take a step back and have a look at flash technologies in general.
Tag Archives: SSD
WHIPTAIL Announce 4.1.1 Update
Last week, I presented at the UK National VMUG. I took the opportunity to catch up with Darren Williams (Technical Director, EMEA & APAC) of WHIPTAIL, who was also presenting at the event. My first introduction to WHIPTAIL came last year when I met Darren at another user group meeting, and I posted about their XLR8R array on the vSphere storage blog. Darren & I discussed the changes which WHIPTAIL has undergone in the 12 months since we last spoke, including the launch of a new range of scale out storage arrays, as well as the new features in WHIPTAIL’s soon-to-be-released 4.1.1 update.
vSphere 5.1 SSD Monitoring reporting N/A for certain fields
One of the new features of vSphere 5.1 was the SSD monitoring and I/O Device Management features which I discussed in this post. I was doing some further testing on this recently and noticed that a number of fields from my SSD were reported as N/A. For example, I ran the following command against a local SSD drive on my host and these were the statistics returned.
Nimbus Data’s new Gemini Array & vSphere Integration
At VMworld 2012 in San Francisco, I had the pleasure of catching up with Scott Kline, Karthik Pinnamaneni & the rest of the team from Nimbus Data. In the weeks leading up to VMworld I read quite a bit about Nimbus Data’s new Gemini Flash Array, but my primary interest was to figure out what integration points existed with vSphere.
Gemini Array
Let’s start with a look at the Gemini Flash Array. The first thing that jumps out is that there are multiple protocols supported for both SAN & NAS. The array supports Fibre Channel, iSCSI, NFS, SMB and InfiniBand protocols. There is no FCoE support at this time, and when I asked the guys why, they said that this is simply due to lack of demand. There is nothing that would prevent them implementing FCoE if there was sufficient demand for it, which they are not seeing right now.
An interesting fact is that Nimbus Data manufacture their own proprietary solid state drives. They purchase the NAND and build the drives themselves. There is a reason for this. One point that Scott and Karthik made to me was that many scale out storage offerings do not scale out their cache with their arrays. This then becomes the bottleneck. Nimbus Data address this by placing cache on each of their drives so as the storage scales out, so does the cache. They refer to this as their Distributed Cache Architecture (DCA).
The ‘secret-sauce’ at the heart of the Nimbus array is the HALO operating system. It provides administration, data protection, optimization, security, and monitoring of Nimbus Data arrays. The Nimbus Data array presents a single SSD device back to the ESXi host(s), either via a block protocol or NFS. Nimbus Data claim that their newer Gemini model can achieve 1.2 million IOPS in a 2U box, with a latency of only 100 microseconds. Yes, that is 0.1 millisecond latency. The I/O block size used to achieve this figure was 4K, with 80% read & 20% write. They were also able to sustain a 12GB throughput with a 256K block size.
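To put those claimed figures in context, here is a minimal back-of-the-envelope sketch. The inputs are the numbers quoted above; the derived bandwidth and queue depth are my own arithmetic (Little's Law), not anything published by Nimbus Data, and the function names are purely illustrative.

```python
# Figures quoted in the post. Everything derived from them below is
# standard storage arithmetic, not a vendor-published number.
IOPS = 1_200_000        # claimed I/O operations per second
BLOCK_SIZE = 4 * 1024   # 4K I/O size, in bytes
LATENCY_S = 100e-6      # 100 microseconds, in seconds


def throughput_bytes_per_s(iops, block_size):
    """Bandwidth implied by an IOPS figure at a given I/O size."""
    return iops * block_size


def outstanding_ios(iops, latency_s):
    """Little's Law: average I/Os in flight = rate x time per I/O."""
    return round(iops * latency_s)


bandwidth = throughput_bytes_per_s(IOPS, BLOCK_SIZE)
print(f"Implied bandwidth at 4K: {bandwidth / 1e9:.2f} GB/s")
print(f"Implied I/Os in flight:  {outstanding_ios(IOPS, LATENCY_S)}")
```

In other words, sustaining that IOPS figure at that latency implies roughly 120 I/Os in flight at any moment, so a single host with a shallow queue depth would not be expected to reach it on its own.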
Flash Longevity
One of the concerns many people have with flash is the lifespan. Nimbus Data are offering 10 year endurance with their drives. There are a number of things they do to mitigate the wear out of their drives. One thing they do is cache the writes in DRAM. Once there is a full 64KB of writes in the cache, they do a full page write to Flash. Nimbus Data also have an algorithm which chooses between the individual flash cells. Each of the cells is rated, and the algorithm will choose the cells which have a higher rating over cells with a lower rating. All of these contribute to the MLC (Multi Level Cell) flash drives lasting the guaranteed 10 years. In fact, Scott told me that 2 years ago they deployed Nimbus Data Flash Arrays at eBay and the flash drives in these arrays have not yet reached 10% usage.
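The two techniques described above can be sketched roughly as follows. This is only a toy model of the general idea (coalescing small writes into full pages, and preferring higher-rated cells). It is not Nimbus Data's implementation; the class, function and variable names are all hypothetical, and only the 64KB page size comes from the discussion above.

```python
PAGE_SIZE = 64 * 1024  # 64KB full-page write, as described in the post


class WriteCoalescer:
    """Toy model: buffer small writes in memory (standing in for DRAM)
    and only write to 'flash' once a full page has accumulated."""

    def __init__(self, page_size=PAGE_SIZE):
        self.page_size = page_size
        self.buffer = bytearray()
        self.flushed_pages = []  # stand-in for pages written to flash

    def write(self, data):
        self.buffer.extend(data)
        # Flush as many complete pages as the buffer currently holds.
        while len(self.buffer) >= self.page_size:
            page = bytes(self.buffer[:self.page_size])
            del self.buffer[:self.page_size]
            self.flushed_pages.append(page)


def pick_cell(ratings):
    """Return the id of the highest-rated cell (hypothetical rating scale)."""
    return max(ratings, key=ratings.get)


coalescer = WriteCoalescer()
for _ in range(17):                  # seventeen 4K writes from the host
    coalescer.write(b"\x00" * 4096)

print(len(coalescer.flushed_pages))  # full pages written to flash
print(len(coalescer.buffer))         # bytes still waiting in the cache
print(pick_cell({"cell-a": 3, "cell-b": 9, "cell-c": 5}))
```

Seventeen 4K writes result in a single 64KB page write with 4K left in the buffer, which is the point of the exercise: many small host writes become far fewer program operations on the flash itself.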
VAAI Integration
Nimbus Data currently support all three VAAI Block Primitives – ATS (Atomic Test & Set), Write Same (Zero) and XCOPY (Clone). They are working on VAAI-NAS primitives but these are not available yet. The driving factor here of course is the VCAI offload – the ability to offload linked clones to the storage array for View Desktop deployments.
Scott also told me that they are working on a management plugin for the new vSphere 5.1 web based client, but it wasn’t available for VMworld 2012. Right now the management is done by an external web based management tool. However I am led to believe that Nimbus Data will have a vCenter plugin for their management tool sometime in Q4 2012.
Business Continuance/Disaster Recovery
The Gemini array is designed to be Fault Tolerant and replication can be configured in either synchronous or asynchronous mode. Snapshots and replication currently work at the volume level. There is no integration with VMware Site Recovery Manager at this time. This is something Nimbus Data are hoping to have in place in the first half of 2013.
Overall, this is an amazing piece of technology. I would like to see even more integration with vSphere products and features going forward, as I personally think that this is a major differentiation factor in the storage market. Still, over 1 million IOPS in a 2U box – impressive stuff.
Get notification of these blogs postings and more VMware Storage information by following me on Twitter: @VMwareStorage
vSphere 5.1 Storage Enhancements – Part 6: IODM & SSD Monitoring
To build on 5.0 enhancements to make the life of a vSphere administrator easier from a storage perspective, vSphere 5.1 includes additional commands for the diagnosis of various storage protocol issues from the ESXi host. This new functionality is called I/O Device Management (IODM).
This new namespace of esxcli commands includes Fibre Channel, FCoE, iSCSI and SAS protocol statistics, as well as SMART (Self Monitoring, Analysis And Reporting Technology) attributes. The aim is to allow an administrator to determine whether a storage issue is occurring at the ESXi, HBA, Fabric or Storage Port level. The commands will enable an admin to look at critical events like frame loss, as well as initiate various resets of the storage infrastructure. The SMART features are very useful as they allow insight into SAS and SATA SSD status, such as the current Wear Leveling state of a drive.
Advanced I/O Device Management – esxcli storage san
There are a number of new namespaces in the 5.1 version of esxcli. There is also a new VMkernel module that instrumented drivers can call into, which includes event caching information.
For example, link down and link up messages from Fibre Channel are logged. The fc (Fibre Channel) namespace also includes an option to perform a LIP (Loop Initialization Primitive) Reset to a given FC adapter on the system. These esxcli commands will also be hooked into vm-support.
Probably one of the nicest parts of this feature is the ability to examine various adapter statistics. This should really assist when trying to troubleshoot storage issues from a vSphere perspective. Here we can see the statistics returned by IODM for a Software iSCSI initiator on an ESXi host. Information such as the number of connections and sessions can help troubleshoot port binding and multipathing configurations on the hosts, and the amount of I/O plus the different types of Protocol Data Units (PDUs) are displayed in a very clear way.
This is a very useful thing to have when trying to monitor your iSCSI infrastructure.
SSD Monitoring
As SSDs become more prevalent, it is important to be able to monitor them from an ESXi host. VMware is providing a module which will monitor a number of different SSD attributes. These include the Media Wearout Indicator, as well as the Temperature & Reallocated Sector Count. The reserved sector count should be about 100, but when the disk surface has issues, the SSD allocates replacement sectors from the reserved sectors. When this count goes to zero, we could start getting sector errors on the SSD, so we need to be aware of any use of the reserved sectors.
To look at the SSD attributes, the following esxcli command can be used:
esxcli storage core device smart get -d naa.xxxxxx
What we see here is the output of a number of different SSD attributes, including the three mentioned previously.
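If you want to check these attributes from a script rather than by eye, something like the following could be a starting point. It is a minimal sketch which assumes the command prints a whitespace-aligned table with Parameter, Value, Threshold and Worst columns. The sample text and its values are illustrative only, not captured from a real host, and the function names are my own.

```python
import re

# Illustrative sample only: the column layout is an assumption about the
# command's tabular output, and the values are made up for this example.
SAMPLE_OUTPUT = """\
Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       N/A    N/A        N/A
Reallocated Sector Count      100    36         100
Drive Temperature             30     0          41
"""


def parse_smart_table(text):
    """Turn the table into {parameter: {column: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Columns are separated by two or more spaces; parameter names
    # themselves only contain single spaces.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = {}
    for line in lines[1:]:
        if set(line.strip()) <= {"-", " "}:
            continue  # skip the dashed separator row
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != len(header):
            continue  # ignore anything that does not fit the layout
        rows[fields[0]] = dict(zip(header[1:], fields[1:]))
    return rows


def unreported(rows):
    """Parameters whose Value column is reported as N/A."""
    return [name for name, cols in rows.items() if cols["Value"] == "N/A"]


smart = parse_smart_table(SAMPLE_OUTPUT)
print(smart["Reallocated Sector Count"]["Value"])
print(unreported(smart))
```

The `unreported` helper is there because not every drive populates every attribute, so a script should treat N/A as "not reported" rather than as a failure.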
The plug-ins will live on the ESXi host in the directory /usr/lib/VMware/smart_plugins. VMware is providing a generic SMART plug-in in 5.1, but disk vendors can provide their own SMART plug-in for additional information.
Smartd is the SMART daemon on the ESXi 5.1 host. It runs every half hour & makes API calls to gather useful diagnostic information from the drives. These events and statistics will not be surfaced up into vCenter in vSphere 5.1. They will only be viewable via the esxcli command line. Although the primary use case is for SSD, the esxcli commands can also be run against HDD to gather certain information.
A script called smartinfo.sh gathers statistics from all disks, SSD or not. This information will also be included in the vm-support log gathering utility output.
