This is something I only learnt about very recently. It seems that we have made a major improvement to the way we do snapshot consolidation in vSphere 6.0. Many of you will be aware that when the VM is very busy, snapshot consolidation may need to go through multiple iterations before we can successfully complete the consolidation/roll-up operation. In fact, there are situations where the snapshot consolidation operation could even fail if there is too much I/O.
What we did previously was use a helper snapshot, redirecting all new I/Os to this helper snapshot while we consolidated the original chain. Once the original chain was consolidated, we then did a calculation to see how long it would take to consolidate the helper snapshot, which could have grown considerably during the consolidate operation. If the time to consolidate the helper was within a certain time-frame (12 seconds), we stunned the VM and consolidated the helper snapshot into the base disk. If it was outside the acceptable time-frame, then we repeated the process (a new helper snapshot while we consolidated the original helper snapshot) until the helper could be committed to the base disk within the acceptable time-frame.
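To make that loop a little more concrete, here is a rough Python sketch of the older, iterative behaviour. This is only a toy model, not how ESXi implements it: the function name, the constant write and commit rates, and the cap on the number of passes are all assumptions made up for illustration. Only the 12 second threshold comes from the description above.

```python
def simulate_consolidation(chain_mb, write_mb_s, commit_mb_s,
                           max_stun_s=12.0, max_passes=10):
    """Toy model of the pre-6.0 helper-snapshot consolidation loop.

    chain_mb    -- size of the original snapshot chain (hypothetical input)
    write_mb_s  -- rate at which the busy VM writes new data (assumed constant)
    commit_mb_s -- rate at which data is consolidated (assumed constant)
    max_stun_s  -- acceptable time-frame for the final stunned commit
    max_passes  -- assumed give-up limit; not a documented value
    Returns (succeeded, passes), where passes is a list of
    (pass_number, helper_size_mb, estimated_commit_seconds).
    """
    passes = []
    # Consolidate the original chain; the helper absorbs all new writes meanwhile.
    helper_mb = write_mb_s * (chain_mb / commit_mb_s)
    for n in range(1, max_passes + 1):
        estimate_s = helper_mb / commit_mb_s
        passes.append((n, helper_mb, estimate_s))
        if estimate_s <= max_stun_s:
            # Small enough: stun the VM and commit the helper to the base disk.
            return True, passes
        # Too big: take a new helper and consolidate the old one while the VM runs.
        helper_mb = write_mb_s * estimate_s
    return False, passes


if __name__ == "__main__":
    ok, passes = simulate_consolidation(chain_mb=10000, write_mb_s=20, commit_mb_s=100)
    print(ok, passes)
```

In this model the helper only shrinks from one pass to the next when the VM writes more slowly than we can consolidate. If the write rate matches or exceeds the commit rate, the loop never converges, which is the "too much I/O" failure case mentioned earlier.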
There is a new snapshot format introduced in VSAN 6.0 called vsanSparse. It replaces the traditional vmfsSparse format (redo logs). The vmfsSparse format was used when snapshots of VMs were taken in VSAN 5.5, and is also the format used when a snapshot is taken of a VM residing on traditional VMFS or NFS. The older vmfsSparse format left a lot to be desired when it came to performance and scalability. This KB article from our support team, indicating that no snapshot should be used for more than 72 hours, and snapshot chains should contain no more than 2-3 snapshots, speaks for itself.
Well, I got so many questions about my previous articles on a new way of doing snapshots with VVols that I decided to take the time and get even deeper into their behaviour. In this setup, I take a Windows 2008 Guest OS running in a virtual machine deployed on an NFS datastore, and I compare it to an identical VM deployed on a VVol datastore. This is purely from looking at how we do snapshots. Remember that with VVols, the VM always runs on the base disk, compared to the traditional way of doing snapshots where the VM always runs on the top-most delta in the chain.
Let’s begin this post with a recap of the Storage vMotion enhancements made in vSphere 5.0. Storage vMotion in vSphere 5.0 enabled the migration of virtual machines with snapshots and also the migration of linked clones. It also introduced a new mirroring architecture which mirrors changes to disk blocks after they have been copied to the destination, i.e. we fork a write to both source and destination using mirror mode. This means migrations can be done in a single copy operation. Mirroring I/O between the source and the destination disks offers significant gains when compared to the iterative disk pre-copy changed block tracking (CBT) mechanism in the previous version, and means a more predictable (and shorter) migration time.
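To illustrate why mirror mode needs only one copy pass, here is a small Python sketch. It is a toy model rather than the real Storage vMotion data mover: the function name, the list-of-blocks disk representation and the way guest writes are scheduled against the copy cursor are all assumptions for the sake of the example.

```python
def migrate_with_mirror(source, writes_during_copy):
    """Copy `source` to a new destination in a single pass, mirroring writes.

    source             -- list of block contents (modified in place by guest writes)
    writes_during_copy -- hypothetical schedule: maps a copy-cursor position to the
                          guest writes (block_index, value) that arrive just before
                          that block is copied
    Returns the destination block list.
    """
    dest = [None] * len(source)
    for cursor in range(len(source)):
        for idx, value in writes_during_copy.get(cursor, []):
            source[idx] = value          # the write always lands on the source
            if idx < cursor:
                dest[idx] = value        # block already copied: fork the write
            # blocks at or beyond the cursor are picked up by the copy pass itself
        dest[cursor] = source[cursor]    # the single copy pass
    return dest


if __name__ == "__main__":
    src = ["a", "b", "c", "d"]
    dst = migrate_with_mirror(src, {2: [(0, "A"), (3, "D")]})
    print(dst == src)
```

Because every write to an already-copied block is forked to both sides, the destination matches the source as soon as the copy cursor reaches the end of the disk, with no follow-up passes to chase changed blocks.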