VSAN 6.2 Part 3 – Software Checksum

This next part of the VSAN 6.2 series of posts focuses on an important feature which many customers have been requesting. VSAN 6.2 introduces another new feature, end-to-end software checksum, to help customers avoid data integrity issues arising from problems on the underlying storage media. In VSAN 6.2, checksum is enabled by default, but may be enabled or disabled on a per virtual machine/object basis via VM storage policies. It is enabled by default because we feel customers will always want to leverage this great new feature. The only reason one might disable it is if the application already includes this functionality.

The new policy capability for checksum is called ‘Disable object checksum’. Selecting it when creating a VM Storage Policy, as shown below, disables checksum; otherwise checksum is always enabled.

Brief Overview of Checksum on VSAN

Checksum on VSAN is implemented using the very common cyclic redundancy check CRC-32C (Castagnoli) for best performance, utilizing special CPU instructions on Intel processors. Every 4KB block has a checksum associated with it, and the checksum is 5 bytes in size. When data is written, the checksum is verified on the same host where the data originates, so that any corruption in flight over the network is caught. The checksum is persisted with the data.
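
To make the per-block scheme concrete, here is a minimal Python sketch that computes a standard CRC-32C for each 4KB block of a buffer. It is illustrative only and is not VSAN's implementation: the function names are hypothetical, the table-driven loop stands in for the CPU instructions mentioned above, and the initial value and final XOR follow the common CRC-32C convention, which VSAN may or may not use.

```python
# Illustrative sketch only; not VSAN's actual implementation.
BLOCK_SIZE = 4096          # 4KB blocks, as described in the post
CRC32C_POLY = 0x82F63B78   # Castagnoli polynomial in bit-reflected form


def _build_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data):
    """Standard CRC-32C (assumed convention: init and final XOR 0xFFFFFFFF)."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_checksums(data):
    """Return one CRC-32C per 4KB block of the buffer (hypothetical helper)."""
    return [crc32c(data[off:off + BLOCK_SIZE])
            for off in range(0, len(data), BLOCK_SIZE)]
```

With this convention, `crc32c(b"123456789")` returns `0xE3069283`, the published check value for CRC-32C. At 5 bytes of checksum per 4KB block, the stored checksums amount to roughly 0.12% of the data they protect.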

On a subsequent read of the data, if checksum is enabled, the checksum data is also requested. If the checksum reveals that the data block that was just read is in some way corrupted, then in the case of RAID-1 objects, the correct data is read from the other replica/mirror. In the case of RAID-5/RAID-6 objects, the data block is reconstructed from the other components in the RAID stripe. An error is also logged to the vmkernel.log file on the host that contains the device where the component erred, as well as on the host where the VM runs. In the example below, we deliberately overwrote a zeroed-out block of data with a random pattern of data, and then read the data via the Guest OS:

2016-02-16T07:31:44.082Z cpu0:33075)LSOM: RCDomCompletion:6706: \
Throttled: Checksum error detected on component \
a3fbc156-3573-4f2c-f257-0050560217f4 \
(computed CRC 0x6e4179d7 != saved CRC 0x0)

2016-02-16T07:31:44.086Z cpu0:33223)LSOM: LSOMScrubReadComplete:1958: \
Throttled: Checksum error detected on component \
a3fbc156-3573-4f2c-f257-0050560217f4, data offset 524288 \
(computed CRC 0x6e4179d7 != saved CRC 0x0)

2016-02-16T07:31:44.096Z cpu1:82528)WARNING: DOM: \
DOMScrubberAddCompErrorFixedVob:327: Virtual SAN detected and fixed a \
medium or checksum error for component \
a3fbc156-3573-4f2c-f257-0050560217f4 \
on disk group 521f5f1b-c59a-0fe2-bdc0-d1236798437c
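
The read path described above can be sketched roughly as follows. This is a simplified model and not VSAN code: the function names and the `(data, saved_crc)` pairs are hypothetical, `zlib.crc32` (plain CRC-32) is used as a stand-in for CRC-32C to keep the example short, and only the single-parity RAID-5 case is shown for stripes.

```python
import zlib
from functools import reduce


class ReadError(IOError):
    """Raised when no verified copy of the block can be produced."""


def checksum(block):
    # Stand-in only: zlib.crc32 is plain CRC-32, NOT the CRC-32C VSAN uses.
    return zlib.crc32(block)


def read_mirrored(replicas):
    """RAID-1: replicas is a list of (data, saved_crc) pairs, one per mirror.

    Returns the first copy whose computed checksum matches the saved one.
    """
    for data, saved_crc in replicas:
        if checksum(data) == saved_crc:
            return data
    raise ReadError("checksum mismatch on every replica")


def read_striped(target, others):
    """RAID-5: target and each entry of others are (data, saved_crc) pairs.

    others holds the remaining data blocks plus parity for the same stripe.
    """
    data, saved_crc = target
    if checksum(data) == saved_crc:
        return data
    # Verify every surviving block before using it for reconstruction.
    for other_data, other_crc in others:
        if checksum(other_data) != other_crc:
            raise ReadError("second checksum error in the same stripe")
    # XOR of the surviving data blocks and parity yields the missing block.
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*(d for d, _ in others)))
```

In this model a mismatch on one copy is repaired transparently from the mirror or from the rest of the stripe, and only a second error with no redundancy left surfaces to the caller as an I/O error.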

Scrubber mechanism

Alongside the checksum verification on read operations, VSAN also has a scrubber mechanism which checks that the data on disk does not have any silent corruption. The scrubber is designed to check all of the data once a year, but it can be tuned to run more often via the advanced setting VSAN.ObjectScrubsPerYear. For instance, if you want it to check all of the data once a week, set this to 52, but be aware that there will be some performance overhead when this operation runs.
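
As a quick aid for choosing a value, the small Python helper below converts a VSAN.ObjectScrubsPerYear value into an approximate interval between full passes. The function name is hypothetical, and the even spacing of scrubs across a 365-day year is an assumption made for the arithmetic, not something the setting is documented here to guarantee.

```python
def scrub_interval_days(scrubs_per_year):
    """Approximate days between full scrubs, assuming even spacing."""
    if scrubs_per_year < 1:
        raise ValueError("scrubs_per_year must be at least 1")
    return 365 / scrubs_per_year


# The default of one scrub per year, then the weekly example from the text.
for value in (1, 52):
    print(f"ObjectScrubsPerYear={value}: "
          f"one pass about every {scrub_interval_days(value):.1f} days")
```

A value of 52 works out to one pass roughly every 7 days, matching the weekly example above.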

Conclusion

Checksum is fully supported with all of the new features, such as RAID-5/RAID-6 and deduplication and compression, and with configurations such as VSAN stretched cluster. As mentioned, it is on by default, so customers simply get the benefit without having to configure it. And if you find you don’t want it, for some reason or other, simply disable it in your VM Storage Policy as shown above. This feature enables VSAN customers to detect data corruption caused by “latent sector errors”, which are typically due to physical drive problems, or by other silent data corruption.

8 comments
  1. Is it possible to control when exactly you want object scrub to occur (such as during a maintenance window on a specific day)? If not, is it possible to see exactly when it will fire next?

    • I don’t believe so. I don’t think there is much customer-facing detail or configuration options other than how often to run. But this is a good question, so let me look into it.

  2. What kind of errors can this checksum algorithm detect in this particular configuration (a 5-byte checksum for 4KB of data): one changed bit, two bits?

    • Not sure I understand. But if there is any difference in the 4KB block of data that is read compared to what was written, then that difference is detected and automatically corrected in VSAN 6.2 by the software checksum feature.

  3. On a system upgraded to 6.2, this policy is not visible in our existing policies. I assume checksumming is enabled by default.
    What happened to all the VMs and their objects? Did the system calculate checksums for all the objects and apply them after the upgrade? Or do we have to take action ourselves to ensure that those existing VMs start getting covered by checksums?

    • It should be there, although I’ve heard folks state that they needed to reboot their vCenter after the upgrade for the new policies to become visible.

      The checksum is enabled by default and is calculated and stored for all existing data blocks. You don’t need to do anything else.

  4. On a read, with any of R1/5/6, if the CRC fails then the data is either read from the mirror or reconstructed from the XOR sum. Is this result then also checked against the checksum, or is it just assumed the recovery worked? If one could be wrong then the other (mirror or XOR reconstruct) could also be wrong. Also, what happens if the checksum fails and the backing store is already operating with 0 safety (1 failed unit in R1/5, 2 in R6)? Do you just get a read error?

    • I asked engineering, Thom – they stated the following: “On recovery read, we assume the reconstructed data is correct and compute a crc from the reconstructed data. This is expected because when we read the data to reconstruct the missed data, crc has already been checked. We cannot verify against the original crc because we do not know whether the crc or the data are wrong.

      If you get another error on a degraded store, an I/O error will be returned.”
