In some previous posts, I highlighted how VVols introduces the concept of “undo” format snapshots, where the VM is always running on the base disk. I also mentioned that this has a direct impact on the way that we do snapshots on VMs that support VSS, the Microsoft Volume Shadow Copy Service. But before getting into the detail of how VVols is different, it’s worth spending some time understanding what’s going on when VSS is called to quiesce applications when a traditional snapshot is taken. If you try to research this yourself, you’ll find that there is very little information describing what is going on. The best description of this behaviour that I found is in the Designing Backup Solutions for VMware vSphere vStorage APIs for Data Protection 1.2 guide:
Windows 2008 application level quiescing is performed using a hardware snapshot provider. After quiescing the virtual machine, the hardware snapshot provider creates two redo logs per disk: one for the live virtual machine writes and another for the VSS and writers in the guest to modify the disks after the snapshot operation as part of the quiescing operations. The snapshot configuration information reports this second redo log as part of the snapshot. This redo log represents the quiesced state of all the applications in the guest. This redo log must be opened for backup with VDDK 1.2.
Even after reading the above, I wasn’t 100% clear on what we were doing or how we were doing it. I did a bit of reading and testing, and had numerous discussions, to figure it out. To begin with, it’s important to understand that we are dealing with two snapshot-related technologies: the VM snapshot and the shadow copy service from Microsoft. I’ll attempt to make it clear which one I am talking about in this post by using the terms “VM snapshot” and “shadow copy”.
To describe VSS in a nutshell, vSphere makes a VSS request to each of the application’s VSS writers to “freeze” the application. This occurs when the quiesce option is chosen when taking a snapshot, and the guest OS supports VSS. Typically this is used to make a consistent backup of the application. At the same time, the VSS provider in the guest OS creates a shadow copy/snapshot of the application. Once the shadow copy is created, holding a consistent state of the application, writes can once again resume against the application. I found this explanation on VSS very informative.
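The freeze, shadow copy, thaw ordering described above can be sketched with a toy model. This is only an illustration of the sequence, not the VSS API: the `Writer` class, its `pending`/`on_disk` lists and the `quiesced` helper are all hypothetical names that I made up for this example.

```python
from contextlib import contextmanager


class Writer:
    """Hypothetical stand-in for an application's VSS writer (not a real API)."""

    def __init__(self, name):
        self.name = name
        self.frozen = False
        self.pending = ["io-1", "io-2"]  # in-flight writes not yet on disk
        self.on_disk = []

    def freeze(self):
        # Flush outstanding writes, then hold new ones.
        self.on_disk.extend(self.pending)
        self.pending.clear()
        self.frozen = True

    def thaw(self):
        self.frozen = False


@contextmanager
def quiesced(writers):
    """Freeze every writer, and always thaw them again afterwards."""
    for w in writers:
        w.freeze()
    try:
        yield
    finally:
        for w in writers:
            w.thaw()


writers = [Writer("app-a"), Writer("app-b")]
with quiesced(writers):
    # The shadow copy is taken while the applications are frozen,
    # so it only ever sees flushed, consistent data.
    shadow_copy = {w.name: list(w.on_disk) for w in writers}
```

The point of the `finally` clause is that applications are released again even if creating the shadow copy fails, so the freeze window stays as short as possible.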
However, because we are also taking a VM snapshot, we have no way of knowing whether the shadow copy has completed. It is very conceivable that the shadow copy of the application has not completed when we take the VM snapshot. In other words, it is not possible to create a VM snapshot at the exact point when an application is considered consistent. But by taking a snapshot of the VM when it is running on traditional NFS/VMFS storage, the base disk is placed in read-only mode and the running point of the VM becomes the snapshot. We cannot leave the base disk writable and have a writable point-in-time snapshot in its chain. This is fundamental to the way snapshots work, as any writes to the base disk will corrupt any descendant snapshots in the chain. In a redo log hierarchy such as this, a snapshot must be considered “immutable” because its descendants in the chain depend on it to be “frozen in time”. This is because the snapshot descendants in the chain inherit a set of unchanged blocks from their parent or parents. Writing to any link other than the bottom-most link in a redo log hierarchy will lead to data corruption. So how can we complete the creation of the shadow copy for the application, and guarantee an application consistent snapshot?
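To see why a parent in a redo log hierarchy must stay immutable, here is a minimal sketch of a chain in which each redo log holds only the blocks written since it was created, and reads fall through to the parent. The `Disk` class and its methods are names invented for this illustration; real vmfsSparse redo logs are considerably more involved.

```python
class Disk:
    """A base disk or a redo log. A redo log stores only its own changed blocks."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        # Walk up the chain until some link holds the block.
        node = self
        while node is not None:
            if lba in node.blocks:
                return node.blocks[lba]
            node = node.parent
        return None  # block was never written


base = Disk()
base.write(0, "A")
base.write(1, "B")

# Take a VM snapshot: the VM now runs on a redo log whose parent is the base.
running = Disk(parent=base)
running.write(1, "B2")
view_before = [running.read(0), running.read(1)]

# Break the rule and write to the parent: the child's inherited block changes.
base.write(0, "X")
view_after = [running.read(0), running.read(1)]
```

Block 0 was never rewritten by the child, so it is inherited from the base. Once the base is modified, the child silently sees different data, which is exactly the corruption described above.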
The traditional approach
This is where the second redo log comes in. It allows the creation of the shadow copy to complete, and then allows the application to be “unfrozen/unquiesced”. Therefore this additional snapshot must be writable and accept changes.
The way this works is that first we will create a writable snapshot against the running point of the VM. This results in an otherwise normal snapshot consistent with the running point. However, we now require the application to commit all outstanding I/Os to the snapshot to put it in an application consistent state. To do this, the writable snapshot is reattached as a separate virtual disk to the VM.
Remember that this writable snapshot does not capture a point-in-time (PIT) delta of the guest OS. It is simply an artifact created on behalf of the VSS shadow copy service, and allows in-flight I/Os belonging to the application to be caught so that an application can be placed into a consistent state, typically for the purposes of backup. In traditional configurations (VMFS, NFS), the creation of two separate redo logs can be observed:
DISKLIB-LIB_CREATE : CREATE CHILD: "/vmfs/volumes/54217270-9baa46c8-20b3-e4115baa8e42/swizzle-yeah/ \
swizzle-yeah-000001.vmdk" -- vmfsSparse cowGran=1 allocType=0 policy=''
.
.
DISKLIB-LIB_CREATE : CREATE CHILD: "/vmfs/volumes/54217270-9baa46c8-20b3-e4115baa8e42/swizzle-yeah/ \
swizzle-yeah-000002.vmdk" -- vmfsSparse cowGran=1 allocType=0 policy=''
The attaching of the writable snapshot as a disk to the VM can be observed in the VM logs:
ToolsBackup: hot adding disk swizzle-yeah-000002.vmdk to node scsi0:1
.
.
HotAdd: Adding disk with mode 'persistent' to scsi0:1
.
.
ToolsBackup: successfully mounted writable snapshot in guest.
Once the writable snapshot is attached to the VM, all I/Os that are needed to put the snapshot in a consistent state are issued to the writable snapshot. Once this process completes, the writable snapshot is removed from the VM and the process is considered complete.
ToolsBackup: hot removing disk swizzle-yeah-000002.vmdk from node scsi0:1.
.
.
Closing disk scsi0:1
.
.
ToolsBackup: Post-processing writable snapshot disk
Now we have an application consistent snapshot of the VM.
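If you want to pick this sequence out of a VM log yourself, something along the following lines may help. It is a rough sketch that assumes the messages look exactly like the excerpts in this post; I have joined the wrapped datastore path onto one line, and the regular expressions and the `summarize` function are my own rather than part of any VMware tool.

```python
import re

# Patterns based only on the log excerpts shown in this post.
CREATE = re.compile(r'CREATE CHILD: "([^"]+)"')
HOT_ADD = re.compile(r"hot adding disk (\S+) to node ([\w:]+)")
HOT_REMOVE = re.compile(r"hot removing disk (\S+) from node ([\w:]+)")

SAMPLE_LOG = """\
DISKLIB-LIB_CREATE : CREATE CHILD: "/vmfs/volumes/54217270-9baa46c8-20b3-e4115baa8e42/swizzle-yeah/swizzle-yeah-000001.vmdk" -- vmfsSparse cowGran=1 allocType=0 policy=''
DISKLIB-LIB_CREATE : CREATE CHILD: "/vmfs/volumes/54217270-9baa46c8-20b3-e4115baa8e42/swizzle-yeah/swizzle-yeah-000002.vmdk" -- vmfsSparse cowGran=1 allocType=0 policy=''
ToolsBackup: hot adding disk swizzle-yeah-000002.vmdk to node scsi0:1
HotAdd: Adding disk with mode 'persistent' to scsi0:1
ToolsBackup: successfully mounted writable snapshot in guest.
ToolsBackup: hot removing disk swizzle-yeah-000002.vmdk from node scsi0:1.
Closing disk scsi0:1
ToolsBackup: Post-processing writable snapshot disk
"""


def summarize(log_text):
    """Return (event, disk, node) tuples in the order they appear in the log."""
    events = []
    for line in log_text.splitlines():
        m = CREATE.search(line)
        if m:
            # Keep only the file name of the new redo log.
            events.append(("create", m.group(1).rsplit("/", 1)[-1], None))
            continue
        m = HOT_ADD.search(line)
        if m:
            events.append(("hot_add", m.group(1), m.group(2)))
            continue
        m = HOT_REMOVE.search(line)
        if m:
            events.append(("hot_remove", m.group(1), m.group(2)))
    return events


events = summarize(SAMPLE_LOG)
```

On the sample above, the summary shows two redo logs being created, with only the second one (the writable snapshot) being hot added to and then hot removed from the VM.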
There were some issues with this approach, namely the inability to offload the snapshot operation to a VAAI-NAS array, as per this blog article about VSS and application level quiescing in Windows 2008. In the next section we will see how VVols addresses some of these issues.
The VVol approach
VVol snapshots change this behaviour once again. If you cast your mind back to an earlier post I did on VVol snapshots, you may recall that VVol snapshots allow the VM to continue running on the base disk. There are no snapshot chains so to speak, as every snapshot is a point-in-time (PIT) copy based on the state of the base disk. Conceptually, you could consider the relationship of VVol snapshots to the VM as looking something like the following. Note that there is no chain of dependencies between the deltas.
Now consider a snapshot request which includes a quiesce request against the applications. With VVols using this snapshot functionality, there is no reason to take an additional snapshot just for VSS and its writers to commit outstanding I/O for application consistency. The reason for this is that we can present a delta, point-in-time (PIT) snapshot back to the VM as a writable snapshot, without impacting any of the other PITs in the snapshot chain (as there is no chain per se). Taking the above example, let’s assume that a third VVol snapshot is requested, and the request includes a requirement for the applications to be consistent. This invokes VSS. The PIT snapshot is taken, and the “swizzling” process takes place to re-parent the snapshot. As discussed, with VVols there are no chains, but every snapshot points back to the parent:
Once the VSS writers have done their thing, as before, the snapshot can be hot removed from the VM and the process is once again considered complete. The third snapshot will become just another PIT VVol snapshot, but it will once again be application consistent.
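The difference from the redo log hierarchy can be sketched in a few lines. In this simplified model, which is my own and not how any particular array implements VVol snapshots, each snapshot is treated as an independent full copy of the base at the time it was taken. The `VVolVM` class and its method names are hypothetical.

```python
class VVolVM:
    """Toy model: the VM always runs on the base; snapshots are independent PIT copies."""

    def __init__(self, blocks):
        self.base = dict(blocks)
        self.snapshots = {}

    def write(self, lba, data):
        # Guest writes always land on the base disk.
        self.base[lba] = data

    def take_snapshot(self, name):
        # Modelled as a full copy for simplicity; no snapshot depends on another.
        self.snapshots[name] = dict(self.base)

    def write_to_snapshot(self, name, lba, data):
        # Stands in for the VSS writers committing I/O to the writable snapshot.
        self.snapshots[name][lba] = data


vm = VVolVM({0: "A", 1: "B"})
vm.take_snapshot("snap1")
vm.write(0, "A2")
vm.take_snapshot("snap2")
vm.write(1, "B2")

# Third, quiesced snapshot: present it back as writable and let VSS finish.
vm.take_snapshot("snap3")
vm.write_to_snapshot("snap3", 2, "vss-flush")
```

Because nothing inherits blocks from `snap3`, writing to it leaves `snap1`, `snap2` and the base untouched, so no second redo log is needed.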
To reiterate, the reason we could not do this with traditional VM snapshots on NFS and VMFS is that the snapshots in the chain must remain “immutable”, i.e. they cannot change. This is because snapshots further down the chain rely on them to remain unchanged.
However, through VVols, we no longer need to create a second redo log just for VSS shadow copy services. This is a major change to how we’ve done snapshots in the past and another example of how VVols is making our snapshot process more efficient.