I had an interesting question the other day about whether Raw Device Mappings (aka RDMs) still had a reliance on the LUN ID, especially when it comes to the vMotion of Virtual Machines which have RDMs attached. I remember some time back that we introduced a concept called Dynamic Name Resolution for RDMs, which meant that we no longer relied on a consistent HBA number or even the path to identify the RDM, but do we still use the LUN ID?
To actually find a reference to the requirement to keep the LUN ID consistent across hosts, I had to go back to the Fibre Channel SAN Configuration Guide which we shipped with ESX 4.1. It explicitly states “To use RDMs successfully, a given LUN must be presented with the same LUN ID to every ESX/ESXi host in the cluster.” However, this guideline only appeared under the EMC & IBM sections of the guide, and I couldn’t find anything in the 5.x documentation.
To make sure that nothing changed around this in 5.x, I did a bit of investigation. I used my NetApp array to present a LUN to two of my ESXi hosts, but used a different LUN ID for each presentation (ID 40 to one host and ID 50 to another host).
I scanned my SAN, and I could see the LUN on each host, but with different LUN IDs.
The VMFS volume on which my VM was deployed was shared to both hosts. I then proceeded to add the Raw Device Mapping to the Virtual Machine. The VM was on host 1, so the RDM had a LUN ID of 50. Then I looked in the RDM metadata file, but nothing in there directly references the LUN ID of the RDM. Although the UUID does look suspiciously like part of an NAA ID (another SCSI identifier mechanism), there is definitely no LUN ID reference.
I next used the vmkfstools -q command to look at the mapping:
# vmkfstools -q WinXP-Lite_1.vmdk
Disk WinXP-Lite_1.vmdk is a Non-passthrough Raw Device Mapping
Maps to: vml.020032000060a98000572d54714e346d63444b44744c554e202020
So the RDM maps to a very long VML identifier. But what is the VML, and how is it generated? VML is short for VMware Legacy. It uses a combination of Controller, Target, Channel & LUN information, as well as the SCSI ID and vendor-specific information, to identify the LUN. We can break the VML down as follows:
- CTL info – 0200320000 (the 32 here is hex for LUN ID 50 – my RDM)
- NAA id – 60a98000572d54714e346d63444b4474
- Vendor – 4c554e202020 (converting hex to ASCII gives ‘LUN’ on NetApp; it differs from array to array)
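The breakdown above can be sketched in a few lines of Python. This is only an illustration of the split described in this post, not a VMware tool: the function name parse_vml is made up, and the fixed field widths (10 hex characters of CTL info, 32 of NAA ID, the remainder vendor-specific) and the position of the LUN ID byte are assumptions taken from this one NetApp example.

```python
def parse_vml(vml):
    """Split a vml.* identifier using the layout described in this post.

    Assumption: 10 hex chars of CTL info, 32 hex chars of NAA ID,
    and whatever is left is the vendor-specific portion.
    """
    body = vml[4:] if vml.startswith("vml.") else vml
    ctl, naa, vendor = body[:10], body[10:42], body[42:]
    return {
        "ctl": ctl,
        # Assumption: the third byte of the CTL info carries the LUN ID in hex.
        "lun_id": int(ctl[4:6], 16),
        "naa": naa,
        # Vendor bytes are ASCII, padded with spaces in this example.
        "vendor": bytes.fromhex(vendor).decode("ascii").strip(),
    }


info = parse_vml("vml.020032000060a98000572d54714e346d63444b44744c554e202020")
print(info["lun_id"], info["naa"], info["vendor"])
```

Run against the mapping shown earlier, this reports LUN ID 50, the NAA ID 60a98000572d54714e346d63444b4474 and the vendor string LUN. Other arrays may well use a different vendor portion, so treat the widths as specific to this example.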
So, yes, even though the metadata file itself does not have a LUN ID reference, the VML that we use as the mapping reference embeds the LUN ID, so there is still a reliance on it.
I now wanted to see the effect this would have on a vMotion operation, so I tried to migrate my Virtual Machine to the other host, which had the RDM presented as LUN ID 40 instead of 50. The vMotion operation failed the compatibility check.
And just for kicks, I searched for that error. The first hit was KB 1016210. In this KB (which elaborates on the VML layout), you will find the following statement:
To resolve this issue, LUN presentation should be made consistent for every host participating in a cluster that could run the virtual machine, the raw device mapping metadata file should be consistent with that presentation, and vCenter Server’s cache of this information should be accurate.
I think that’s pretty conclusive, don’t you? To finish, I went back to the array and had it present the LUN to all hosts with a matching LUN ID. I was then able to successfully vMotion the Virtual Machine with an RDM between ESXi hosts.
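If you want to check for this before attempting a vMotion, the comparison itself is trivial once you have collected the LUN ID that each host sees for the device. Here is a minimal sketch; the host names and the way the IDs are gathered are hypothetical, and the input is just a plain dictionary that you would fill in from your own rescans.

```python
from collections import defaultdict


def is_presentation_consistent(lun_ids_by_host):
    """True when every host sees the device with the same LUN ID."""
    return len(set(lun_ids_by_host.values())) <= 1


def group_hosts_by_lun_id(lun_ids_by_host):
    """Group host names by the LUN ID they see, to spot the odd one out."""
    groups = defaultdict(list)
    for host, lun_id in sorted(lun_ids_by_host.items()):
        groups[lun_id].append(host)
    return dict(groups)


# Hypothetical host names; the IDs mirror the mismatch from my test.
seen = {"esxi-01": 50, "esxi-02": 40}
if not is_presentation_consistent(seen):
    print("LUN ID mismatch:", group_hosts_by_lun_id(seen))
```

With the mismatched presentation from my test, this flags the two hosts as seeing different LUN IDs; once the array presents the LUN with the same ID everywhere, the check passes.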
Bottom line – yes, RDMs still have a reliance on LUN IDs matching across all hosts, even in vSphere 5.x.
Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage