Handling VSAN trace files when ESXi boots from a flash device
I’ve been involved in a few conversations recently regarding how VSAN trace files are handled when an ESXi host that is participating in a VSAN cluster boots from a flash device. I already did a post about some of these considerations in the past, but it focused mostly on USB/SD. SATADOM was not included in that discussion, as we did not initially support SATADOM in VSAN 5.5, and only announced SATADOM support for VSAN 6.0. It seems that there are some different behaviors that need to be taken into account between the various flash boot devices, which is why I decided to write this post.
Let’s start with ESXi hosts that are booting from either USB sticks or SD cards. I’m grouping these together since the considerations are more or less the same from a VSAN trace perspective. As outlined in the previous post, when an ESXi host booting from one of these devices is also running VSAN, VSAN traces are written to a RAM disk. Since the RAM disk is non-persistent, these traces are written to persistent storage either during host shutdown or on system crash (PANIC). This means that the VSAN traces, which are typically quite write intensive, do not burn out the boot media. This method of first writing the traces to RAM disk and later moving them to persistent storage is handled automatically by the ESXi host and there is no user action required. This is the only supported method of handling VSAN traces when booting an ESXi host from either a USB stick or an SD card. You cannot write VSAN traces directly to SD or USB boot devices at this time.
This brings us to another type of flash device, the SATADOM. SATADOMs, short for Serial ATA Disk on Modules, are basically flash memory modules designed to be inserted into the SATA connector of a server. In VSAN 6.0, ESXi hosts running VSAN are supported when booting from SATADOM, as long as they meet specific requirements. On ESXi hosts that boot from SATADOM, the VSAN traces are written directly to the SATADOM. In other words, there is no RAM disk involved. This is why specification requirements for SATADOM were documented in the VSAN 6.0 Admin Guide, and the requirement is for an SLC (single level cell) device. SLC devices have higher endurance and quality when compared to other flash devices. The reason for this requirement is once again to prevent any sort of burn-out occurring on the boot device when trace files are being written to it.
Customers who wish to boot their ESXi hosts (participating in VSAN) from a flash device should weigh the above considerations for VSAN traces on USB/SD against the cost of a high-end SATADOM.
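To make the comparison concrete, here is a small Python sketch that encodes the trace handling rules described above as a lookup. It is only an illustration of the rules in this post, not anything ESXi itself exposes: the function name, the `TracePlan` fields, and the device labels ("usb", "sd", "satadom") are all made up for this example.

```python
from dataclasses import dataclass

# Device labels are hypothetical names chosen for this sketch.
RAM_DISK_DEVICES = {"usb", "sd"}


@dataclass(frozen=True)
class TracePlan:
    trace_target: str      # where VSAN traces are written while the host runs
    persisted_on: tuple    # events that move traces to persistent storage
    note: str              # how burn-out of the boot media is avoided


def vsan_trace_plan(boot_device: str) -> TracePlan:
    """Return how VSAN traces are handled for a given flash boot device."""
    device = boot_device.strip().lower()
    if device in RAM_DISK_DEVICES:
        # USB/SD: traces go to a RAM disk and are only persisted
        # at host shutdown or on a system crash (PANIC).
        return TracePlan(
            trace_target="ram_disk",
            persisted_on=("shutdown", "panic"),
            note="traces are kept off the boot media during normal operation",
        )
    if device == "satadom":
        # SATADOM: traces are written directly to the device, no RAM disk.
        return TracePlan(
            trace_target="boot_device",
            persisted_on=(),
            note="an SLC device is required for endurance",
        )
    raise ValueError(f"boot device not covered by this sketch: {boot_device!r}")


if __name__ == "__main__":
    for name in ("USB", "SD", "SATADOM"):
        print(name, vsan_trace_plan(name))
```

The empty `persisted_on` tuple for SATADOM reflects that the traces are already on persistent media, so there is nothing to copy at shutdown or crash time.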
Hope that explains the differences. I have an older post here on SLC/MLC/eMLC if you wish to learn more.
Hi Cormac,
Thanks very much for this post! We dodged this bullet (just barely) by deploying on hosts with 512GB RAM. Per the design guide we were “cleared” to use SD to boot. In the future, if we were to upgrade our RAM, I believe the only option we’d have is to use something like a SATADOM.
Can you share why the Design Guide recommends that hosts with >512GB RAM should not boot from SD? What’s the worst thing that could happen?
It would also be great to understand what VSAN traces are. It’s clear they contain low-level information about what VSAN is up to and that they are important to VMware Support. Can they be useful to customers who want to dig around a bit?
The 512GB limit is basically due to PSOD sizes – more information in this post: http://cormachogan.com/2015/02/24/vsan-considerations-when-booting-from-usbsd/
VSAN traces are basically trace files that capture all of VSAN’s activity. Not sure that they would be of much use to customers, to be honest; their role is to help engineering trace back through a sequence of events leading up to a particular issue.
Thanks very much for the PSOD lead. I didn’t know quite what the rationale for that >512GB “limitation” was.
The link to your other blog posts seems to indicate that it is a concern about having enough capacity on the SD card for the traces in the event of a crash.
I guess I’m just thinking that if a customer had a large SD card (say 32GB or greater), would that concern go away?
If that all holds, then why wouldn’t the recommendation in the Design guide look more like a table where one axis is SD card size and the other axis is System RAM and X’s are used to indicate which RAM sizes are supported in which SD card capacities?
I feel like I’m missing something. Thanks so much for taking time to reply.
Well, the partition table is a little different for SD/USBs. The traces need to be stored in /locker and the core dumps need to be stored in the diagnostic partition. So there is no difference for VSAN traces when it comes to different size SD/USBs, thus no table in the design guide.
For PSODs, I think that the recommendation is because the size of the diagnostic partition is quite small by default and cannot be modified. When you have a large amount of memory, you must basically select another location, and of course we cannot dump the cores to the VSAN datastore. I haven’t tried creating a new core partition on the extra space on a larger SD card, but I’m guessing it would work. I’m not sure if there are other supportability concerns with that, so it would be worth checking with GSS. There is probably another reason that I am missing, but the guidance we received was that with memory above 512GB, you should dump cores to a disk and not the USB/SD card.
There is additional information in KB 2012362 about sizing the core dump partition, including considerations around slot size.
Ok, thanks again for the info.
very informative 🙂