DR of vCenter Operations – Method 2 (IP Customization)

Earlier this week I spoke about our efforts to fail over vCenter Operations Manager (vCops) between two sites. In that article I stated that we used vApp containers at the DR site, and added vApp variables to the Analytics and UI VMs at the recovery site. While this was painstaking to set up initially, it did give us the ability to fail over vCops seamlessly to the DR site, with the vApp VMs inheriting their network settings via the vApp construct. At the end of that post, I mentioned a KB article, 2031891, which discusses the DR of vCops using IP Customization via SRM Recovery Plans. That approach also uses a Resource Pool to hold the vCops VMs rather than a vApp. I will cover our experiences with it in this post.

In this particular test, we are failing over the vCenter Server Appliance (VCSA) and vCops (version 5.8.2) at the same time. We have also configured DNS across the board, and have used KB article 2017835 to ensure that vCops is launched using FQDNs (the previous vCops DR post has all the details). One thing we noticed was that the Analytics VM was not displaying its DNS name in the VM Summary tab, whereas the UI VM was. To address this, we ran the utility /opt/vmware/share/vami/vami_config_net and selected option 3 to change the hostname from localhost.localdom to our correct FQDN:

[Screenshot 1: setup vcopsan hostname]

On checking the /etc/hosts file, the Analytics VM now had a good FQDN hostname. We were now ready to configure the recovery plan. I had two separate protection groups, one for VCSA and one for vCops. I added both to the Recovery Plan. These were the Recovery Plan settings we gave to each VM:

UI VM

  • Set up IP customization
  • Add a dependency on the Analytics VM
  • Set the VMware Tools timeout to 30 minutes

Analytics VM

  • Set up IP customization
  • Add a dependency on the VCSA

VCSA

  • Set up IP customization
  • Add a prompt. The recovery plan pauses at this step, so you can make the appropriate DNS updates in your IPAM system. If you are doing a test failover and placing the VMs live on the network, you can also use this pause to remove the network uplinks from the VMs at the Production site so that there is no contention between them.
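The dependencies listed above form a simple chain, and it can help to see the resulting power-on order written out. The following is only a minimal sketch that models the ordering with Python's standard `graphlib` module; it does not call SRM, and the `DEPENDENCIES` mapping and `power_on_order` function are names made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical model of the recovery plan: each VM maps to the set of
# VMs it depends on, taken from the settings listed in this post.
DEPENDENCIES = {
    "VCSA": set(),
    "Analytics VM": {"VCSA"},
    "UI VM": {"Analytics VM"},
}


def power_on_order(dependencies):
    """Return the VMs in an order that satisfies every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # The VCSA comes first, then the Analytics VM, then the UI VM.
    print(power_on_order(DEPENDENCIES))
```

Because each VM depends on exactly one other, there is only one valid order, which is why the UI VM cannot be brought up until the Analytics VM has been recovered.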

We first ran a test failover, but brought the VCSA and vCops VMs up on the live network rather than an auto/bubble network (which meant we had to disconnect the uplinks of the VMs on the Prod site when the prompt step was reached). The test ran successfully, but it took a total of 17 minutes to fail over the VMs from start to finish. One of the problems is that the UI VM tries to communicate with the Analytics VM before the guest customization is complete. This means that timeouts need to expire, with the UI VM displaying “ssh:connect to host secondvm-external port 22: No route to host” repeatedly. This would appear to be because SRM removes the network uplinks from the VMs for IP customization, and then adds them back afterwards (Reconfigure virtual machine tasks can be observed taking place during this time). It is the reason why we wait so long (30 minutes) for VMware Tools to become active on this VM during failover (we need Tools for the customization). This delay alone would nudge me towards the vApp construct methodology of my previous post, as failovers were a lot quicker using that method.
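The retry-until-timeout behaviour described above can be sketched as a small polling loop. This is only an illustration of waiting for a host to become reachable on port 22, not what the UI VM or SRM actually runs; the `wait_for_port` function and its parameters are hypothetical, with the 30 minute default mirroring the VMware Tools timeout we set.

```python
import socket
import time


def wait_for_port(host, port=22, timeout_s=1800.0, interval_s=10.0):
    """Poll until host:port accepts a TCP connection or timeout_s elapses.

    Returns True once a connection succeeds, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means the network path is back.
            with socket.create_connection((host, port), timeout=5.0):
                return True
        except OSError:
            # Covers "No route to host", refused connections,
            # timeouts and name resolution failures.
            pass
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval_s, remaining))


# Example call (hostname taken from the error message in this post):
#   wait_for_port("secondvm-external", port=22, timeout_s=1800)
```

A loop like this makes the cost visible: while the uplinks are removed for customization, every attempt fails, and nothing can proceed until either the network returns or the timeout expires.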

Note that because we are not using a vApp construct for vCops on the DR site, this leads to a number of “Unable to find OVF environment” errors. Once we were finished, we cleaned up the test failover environment.

We then went for a full failover (and since the Prod site was still up, the VMs at that site were powered off).

[Screenshot 5: Failover]

Everything successfully failed over, but this also took a long time to complete (17 minutes once again).

[Screenshot 6: failover complete]

At completion, the VCSA and vCops were fully functional on the DR site. When we logged into the vCops Admin UI, we could see the registration to vCenter was still intact and there was no prompt to re-register. We could also test the connection and update the registration successfully. There was no need to re-register vCops with the VCSA on the DR site. This is because we were using DNS across the board, and no IP addresses. Nor did we need to run a repair script. Check back on the previous post for a full list of steps to fully configure vCops for DNS/FQDNs.
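Since the registration survives only when every endpoint is referenced by name, a quick sanity check is to look for any IP literals among the configured hosts. The sketch below uses Python's standard `ipaddress` module; the function names and the example hostnames are hypothetical and are not taken from our environment or from any vCops tool.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host is an IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def hosts_using_ip(hosts):
    """Return the entries that are IP literals and would break on re-IP."""
    return [h for h in hosts if is_ip_literal(h)]


if __name__ == "__main__":
    # Hypothetical endpoints; 192.0.2.10 is a documentation-only address.
    configured = ["vcsa.example.com", "192.0.2.10", "vcops-ui.example.com"]
    print(hosts_using_ip(configured))
```

Any entry this reports would need to be replaced with an FQDN before failover, because its address changes when the VM is customized at the DR site.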

Summary

The purpose of this exercise was to show that an alternate method can be used to provide DR for your vCops infrastructure, other than the vApp option in the original DR post. However, even though it is possible, it takes far longer than the vApp method. Given the choice, although more configuration steps are required, I would go for the vApp option, as it seems more elegant than the IP customization method.
