Network Virtualization (NSX) and vSphere Data Protection Interop

In this third article in the series on backing up the vCloud Suite, we turn our attention to NSX, VMware’s Network Virtualization product. Before starting, I should point out that NSX has a recommended way of backing up and restoring configuration information via the use of an FTP server, which you need to configure in your infrastructure to hold this exported metadata. However, this exercise looks at how you might be able to use VDP to back up and restore an NSX configuration using image level backups. Once again, I wanted to see whether I could restore the NSX environment to a particular point in time, both in-place and by restoring to a new location. This is the same infrastructure that I used for backing up and restoring vCops and backing up and restoring vCAC and VCO. On this occasion, I was using NSX version 6.0.4, vCenter 5.5U1 and VDP version 5.5.6.46.

NSX Overview

I don’t want to spend a lot of time on this, but I did want to provide a brief overview of the various appliances/virtual machines which go to make up an NSX deployment.

The NSX Manager is the centralized network management component of VMware NSX for vSphere. It is installed as a virtual appliance on any ESX™ host in your vCenter Server environment.

The NSX Controller is the control plane of the NSX solution. It is deployed as a three-node cluster, and the virtual appliances provide, maintain and update the state of all network functions.

The figure below shows my NSX Manager and NSX Controllers configuration with a normal status:

The Edge Services Gateway (ESG), also referred to as the NSX Edge, is a virtual appliance that can provide routing, firewall, load balancer, VPN, Layer 2 bridging services and more. These are deployed on an as-needed basis, depending on the services required for a particular network. NSX Edges may also be deployed in an HA-pair configuration for high availability.

The VMware Endpoint offloads antivirus and anti-malware agent processing to a dedicated secure virtual appliance delivered by VMware partners, for example Trend Micro’s Deep Security product.

Backing up and restoring the NSX Manager, Controllers and Edge are within the scope of this test. VMware Endpoints were not. In my simplified configuration, I had a single NSX Edge deployed which provided a DHCP service to any VMs deployed on that network. Before starting, I deployed my test VM and verified that it was successfully picking up a DHCP IP address – it was. Caveat: The NSX Edge – DHCP configuration is shown below – no other services were enabled. I made the assumption that a single service should be good enough, and that if it was still working after a restore, then other services would presumably work as well.

Restoring NSX to a new location

In this first test, I will concentrate on backing up the NSX Manager VM, the 3 NSX controller VMs, and the 2 NSX edge VMs, and do a restore to a new location. I will then power down the existing NSX appliances and boot up the restored appliances and see if everything is manageable from a vSphere perspective. Post restore, I will then verify that my demo VM still gets a valid DHCP IP Address by doing a release/renew operation within the Guest OS.

I created a backup job which included the 6 NSX VMs (Manager x 1, Controllers x 3, Edge x 2 [HA pair]). Details on how to create the backup job can be found in the appropriate VDP documentation. I started the backup and it proceeded as expected.

Observation: After the backup completed, I noticed that my 3 NSX Controllers all had a Snapshot Consolidation needed message. I’m not sure why this is the case, and further investigation is needed. A simple snapshot consolidation operation on the NSX Controllers via the vSphere client cleared the issue.

I then proceeded to shut down my current NSX infrastructure. I used the following shutdown order:

  • Edges first
  • Controllers next
  • Manager last
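The ordering above can be captured in a few lines of Python, which may be handy if you script the power operations. This is only a sketch: the VM names and the role labels are made up for illustration, and it makes no calls to vCenter. It simply sorts an inventory into the shutdown order used here, and reverses it for the power-on order used after the restore.

```python
# Lower rank is shut down first: Edges, then Controllers, then the Manager.
SHUTDOWN_RANK = {"edge": 0, "controller": 1, "manager": 2}


def shutdown_order(vms):
    """Return VM names sorted Edges -> Controllers -> Manager."""
    ordered = sorted(vms.items(), key=lambda kv: (SHUTDOWN_RANK[kv[1]], kv[0]))
    return [name for name, _role in ordered]


def power_on_order(vms):
    """Power-on is the reverse of shutdown: Manager -> Controllers -> Edges."""
    return list(reversed(shutdown_order(vms)))


# Hypothetical inventory: these names are NOT from my lab, they are placeholders.
nsx_vms = {
    "nsx-manager": "manager",
    "nsx-controller-1": "controller",
    "nsx-controller-2": "controller",
    "nsx-controller-3": "controller",
    "nsx-edge-0": "edge",
    "nsx-edge-1": "edge",
}

print("Shutdown:", shutdown_order(nsx_vms))
print("Power on:", power_on_order(nsx_vms))
```

Keeping the role-to-rank mapping in one place means the two orders cannot drift apart.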

As expected, I could no longer see any NSX Manager, Controllers or Edges in the vSphere web client. As a test I also rebooted my VM and saw that it was no longer picking up a DHCP IP address.

Now it was time to restore my NSX configuration from VDP, and see if it came online. I selected all 6 NSX VMs for my restore job. The restore completed successfully.

I then moved all 6 NSX VMs into their own VM folder for ease of management. I proceeded to power the VMs back on using the following power up order:

  • Manager first
  • Controllers next
  • Edges last

Checking the NSX configuration post restore, I could once again see my NSX Manager and 3 NSX Controllers, all showing a status of Normal. All good so far. However, on examining the NSX Edge, I saw the following status:

Failed to reboot vShield edge vm. Failed to reboot guest on VM {0}.

I didn’t know how serious this status was, so I decided to see if the DHCP service was working. Unfortunately, it was not. My test VM was unable to contact my DHCP server. In fact, I could not even ping the NSX Edge IP address.

Troubleshooting NSX Edge

I logged onto the NSX Edge as admin and used some of the CLI commands described in the NSX CLI reference guide. The show interface command indicated that no interfaces were up. Using the show log command, I looked back at the log messages written since the NSX Edge was powered up after the restore. There were a number of messages related to “system in bad state”, but going back earlier in the logs, the crux of the issue seemed to be the following:

config: ERROR :: C_INTF_UTIL :: [73001]invalid MAC address 00:50:56:xx:xx:xx

config: ERROR :: VseCommandHandler :: Configuration request failed eventually. Error: [C_INTF_UTIL][73001]invalid MAC address 00:50:56:xx:xx:xx at /opt/vmware/vshield//Framework/lib.pm line 89.

It would appear to me that there is some dependency on the MAC address of the original NSX Edge. When the Edge is restored to a new location, it picks up a new MAC address, which causes configuration errors when that NSX Edge is powered on.
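If you have a lot of Edge log output to wade through, a small script can pull out just these errors. The following is a rough sketch in Python. It assumes you have copied the show log output into a string or text file yourself, and the function name is my own invention. The sample text is simply the two log lines quoted above, with the MAC address masked as in this article.

```python
import re

# Matches "[<code>]invalid MAC address <mac>". The MAC pattern also accepts
# the letter x because the addresses quoted in this article are masked.
INVALID_MAC = re.compile(
    r"\[(\d+)\]invalid MAC address ((?:[0-9A-Fa-fxX]{2}:){5}[0-9A-Fa-fxX]{2})"
)


def find_invalid_mac_errors(log_text):
    """Return unique (error_code, mac) pairs in order of first appearance."""
    found = []
    for match in INVALID_MAC.finditer(log_text):
        pair = (match.group(1), match.group(2))
        if pair not in found:
            found.append(pair)
    return found


# The two error lines quoted above, pasted in as sample input.
sample_log = (
    "config: ERROR :: C_INTF_UTIL :: [73001]invalid MAC address 00:50:56:xx:xx:xx\n"
    "config: ERROR :: VseCommandHandler :: Configuration request failed eventually. "
    "Error: [C_INTF_UTIL][73001]invalid MAC address 00:50:56:xx:xx:xx "
    "at /opt/vmware/vshield//Framework/lib.pm line 89.\n"
)

for code, mac in find_invalid_mac_errors(sample_log):
    print(f"error {code}: Edge rejected MAC {mac}")
```

Both lines report the same error code and address, so the script prints a single entry for them.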

Bringing a Restored NSX Edge online

I decided to see if there were some ways to get the restored NSX Edge to come online and provide services. I attempted a “force sync” operation from the UI, but this failed almost immediately and the “Failed to reboot vShield edge …” message reappeared. I then decided to “redeploy” the NSX edge. This seemed to work, and the status now changed to “Deployed” once again. I could once again ping the edge.

The final step was to test that the DHCP service was working. It was, since my test VM could once again pick up an IP address.

Observation: A redeployed NSX Edge is automatically placed in the Discovered virtual machine folder in the vCenter inventory. Although I moved my NSX VMs to a different folder for manageability, the redeploy operation placed the redeployed NSX Edge back in the Discovered virtual machine folder while leaving my restored but non-working Edge in my folder.

Based on this test, I concluded that NSX Manager and Controllers can be backed up and restored to a new location by VDP. NSX Edges cannot be restored to a new location by VDP. There appears to be a dependency on the original MAC address of the appliance, and when the Edge is restored to a new location it gets a new MAC address, which invalidates the NSX Edge configuration.

At this point, I do not think we can successfully restore an Edge device to a new location using VDP (there might be ways around this by modifying the MAC address, etc., but we really don’t want or recommend getting into this). I would say that we can restore the NSX Manager and its Controllers to a new location successfully, but any NSX Edges require a redeploy.
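One way to spot this situation early is to note the MAC addresses of the Edge VMs before the backup and compare them with the restored VMs. The sketch below is purely illustrative. The VM names and MAC addresses are made up, and it assumes you have collected the addresses yourself (for example from the network adapter settings of each VM). It does not talk to vCenter or NSX.

```python
def vms_with_changed_macs(before, after):
    """Return names of VMs whose set of MAC addresses differs after restore."""
    changed = []
    for name, macs in before.items():
        old = {m.lower() for m in macs}
        new = {m.lower() for m in after.get(name, [])}
        if old != new:
            changed.append(name)
    return sorted(changed)


# Hypothetical data: addresses recorded before the backup ...
before_backup = {
    "nsx-edge-0": ["00:50:56:aa:bb:01"],
    "nsx-edge-1": ["00:50:56:aa:bb:02"],
}
# ... and after a restore to a new location, where new MACs were assigned.
after_restore = {
    "nsx-edge-0": ["00:50:56:cc:dd:01"],
    "nsx-edge-1": ["00:50:56:AA:BB:02"],  # unchanged, only the case differs
}

for name in vms_with_changed_macs(before_backup, after_restore):
    print(f"{name}: MAC changed, expect to redeploy this Edge")
```

In this made-up example only nsx-edge-0 is flagged, since the comparison ignores letter case.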

Restoring NSX to its original location

The second part of the test was to attempt a restore to the original location using VDP. The backup was done the same way as before, but before attempting the restore I needed to power down the current NSX appliances (Manager/Controllers/Edges). I used the same shutdown order as before: Edges first, Controllers next and Manager last.

The backup and restore operations were both successful, so I powered on the VMs in reverse order. Once up and running, I checked the NSX configuration status:

  • NSX Manager – ok
  • NSX Controllers x 3 – Status: Normal
  • Host Prep – all ok
  • Service Deployment – VMware Endpoint – Status: UP
  • Logical Switch – Status: Normal
  • NSX Edge – Status: Deployed (note this was “Failed to reboot” previously)
  • DHCP Service: UP
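If you repeat this kind of test regularly, the checklist above can be written down as data and compared against whatever you observe after each restore. The following Python sketch assumes you type in (or otherwise collect) the observed statuses yourself. The dictionary keys and the function name are my own, not anything NSX exposes.

```python
# Expected post-restore statuses, taken from the checklist above.
EXPECTED = {
    "NSX Manager": "ok",
    "NSX Controllers": "Normal",
    "Host Prep": "all ok",
    "VMware Endpoint": "UP",
    "Logical Switch": "Normal",
    "NSX Edge": "Deployed",
    "DHCP Service": "UP",
}


def failed_checks(observed):
    """Return {component: observed_status} for every mismatch or missing entry."""
    failures = {}
    for component, wanted in EXPECTED.items():
        got = observed.get(component, "missing")
        if got.lower() != wanted.lower():
            failures[component] = got
    return failures


# Example: what I saw after the restore to a NEW location earlier in this post.
observed = dict(EXPECTED)
observed["NSX Edge"] = "Failed to reboot vShield edge vm"
observed["DHCP Service"] = "DOWN"  # assumed label for the non-working service

for component, status in failed_checks(observed).items():
    print(f"{component}: {status}")
```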

From my test VM, I did an ipconfig /release and then an ipconfig /renew, and this time I managed to get my DHCP IP address from my NSX Edge, suggesting that the service is working. Since my NSX Edge was restored in place, it continued to have the same MAC address as before, and thus the configuration remained valid.

It would seem that NSX Manager, Controllers and Edge can be backed up and restored to the original location by VDP.

Having discussed the behaviour of the NSX Edge when restored to a new location, it might be considered a best practice to redeploy NSX Edges post restore anyway, simply because you may have made a configuration change via the NSX Manager which is not reflected in the restored NSX Edge. Doing a redeploy post restore will ensure that the NSX Manager and its Edges are synchronized. Remember that a redeploy places the Edge in the Discovered virtual machine folder.

Disclaimer

Again, like my previous VDP tests against vCloud Suite components, this was a very simple setup. I wish to reiterate that the only NSX Edge service that was configured during these tests was DHCP and no VMware Endpoints were included in the tests.

Questions for readers

How do you back up your vCloud Suite components? What tools/backup products do you use? How do you verify consistency and prove that your backups are valid? I’d really like to hear from you, so please leave a comment if you can.
