NSX ALB v22.1.1 – New Setup Steps

Many readers with an interest in Kubernetes, and particularly Tanzu, will be well aware that there is no embedded Load Balancer service provider available in vSphere. Instead, the Load Balancer service needs to be provided through an external source. VMware supports a number of different mechanisms to provide such a service for Tanzu. One of the more popular providers is the NSX Advanced Load Balancer, formerly Avi Vantage. In the most recent release, version 22.1.1, some of the setup steps have changed significantly. In this post, I will highlight the setup of the new NSX ALB.

Important: NSX ALB v22.1.1 has an option to choose a Content Library to store the Service Engine images. This should be created prior to configuration of the NSX ALB.

Note: There are a lot of screenshots to show you here, so the post is rather long. Hopefully it will be helpful though.

Part 1 – Deploy the NSX ALB OVA

The NSX ALB OVA deployment is very similar to many other OVAs. The only real consideration is whether or not the NSX ALB will be integrated with NSX-T, possibly providing L7 ingress functionality. That is not the case here, so the deployment will be very straightforward. It is made even simpler by the fact that I am choosing to use DHCP for the networking details. Here are the 8 steps of the OVA deployment.

1.1 Provide the path/url to the OVA

1.2 Provide a name for the NSX ALB controller

1.3 Select a Compute Resource

1.4 Review details – is this the correct OVA?

1.5 Select Storage Policy and vSphere Datastore

1.6 Select networks

Select the network on which the management interface of the NSX ALB controller will reside.

1.7 Customize the template, or simply leave everything at DHCP

As mentioned earlier, apart from the network information, the only other details requested here relate to NSX-T integration. If no NSX-T integration is planned, these fields can be left empty.

1.8 Ready to Complete? Click Finish

And that completes the initial deployment. The NSX ALB controller will take a few minutes to come online. If you try to connect a browser to the IP Address or FQDN of the controller in the meantime, you may see messages similar to the following:

You may also observe an nginx error before the controller is ready.

However, eventually the controller should come online and you will be able to connect to it. This means that we can now proceed with the configuration of the NSX ALB so that it can act as a Load Balancer provider for our Tanzu deployments.
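Rather than refreshing a browser, you can poll the controller's web port until it answers. This is just a sketch; the controller URL in the example comment is a placeholder for your own deployment.

```python
import ssl
import time
import urllib.request

def wait_for_controller(url, timeout=900, interval=15):
    """Poll the controller URL until it answers HTTP 200 or the timeout expires."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # controller uses a self-signed cert at first boot
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, context=ctx, timeout=10) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # connection refused, or the interim nginx error, while services start
        time.sleep(interval)
    return False

# Example (hypothetical controller FQDN):
# wait_for_controller("https://nsxalb.example.local/")
```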

Part 2 – Configure the NSX ALB for Tanzu Kubernetes

We can outline the steps involved in configuring the NSX ALB for Tanzu Kubernetes as follows:

  1. Create a basic admin account on the NSX ALB and provide basic admin details, such as DNS.
  2. Set the Default-Cloud infrastructure type to vSphere, add credentials to connect to vCenter Server, and provide Content Library details if used.
  3. Create an IPAM (IP Address Management) Profile for the Load Balancer network, so that the NSX ALB knows which network and which IP addresses to allocate as a provider. Add the IPAM Profile and management network details to the Default-Cloud.
  4. Configure the Load Balancer/VIP network, selecting between DHCP or a static IP address pool for VIPs and Service Engines. Set up Virtual Routing and Forwarding (VRF) between the workload network and the load balancer network, if necessary.
  5. Create a self-signed SSL certificate to allow the NSX ALB to trust connections from Tanzu.
  6. Add licensing details.

Let’s look at those configuration steps in greater detail.

2.1 Basic admin setup

Once connected to the controller web interface, you are prompted to provide a password and an (optional) email address. This enables the “Create Account” button. Click it to proceed.

In the Welcome screen, passphrases as well as DNS information are requested. There are also some additional Email/SMTP settings which may be populated, as well as information around multi-tenancy. For on-premises vSphere deployments, the latter is not used (to the best of my knowledge). The only other key thing to point out is that you should select the checkbox at the bottom of the screen to “Setup Cloud After” before hitting Save. This is often missed, though you can manually set up the Cloud instance later on if you forget to select the checkbox.

This takes us to the Default-Cloud configuration, which is where most of the setup is implemented.

2.2 Default-Cloud Setup – vCenter Connectivity

After clicking Save, you will be brought to the Default-Cloud screen. It will look similar to the following:

Note that the Type is currently set to “No Orchestrator“. You will have to click on the cog highlighted in the screenshot above to change it to VMware vCenter / vSphere ESX.

A common question at this point is whether or not a new cloud could be created for Tanzu Kubernetes. The answer is no. The AKO (Avi Kubernetes Operator) that is deployed in the Tanzu Kubernetes clusters, and which communicates to the NSX ALB, expects to find a cloud called Default-Cloud in order to get items such as Load Balancer Virtual IP Addresses (VIPs). Thus new clouds should not be created.

The next step is to edit the Default-Cloud settings and add the vCenter Server credentials. Click on the pencil icon next to the cog icon selected previously. This will open the general setting view. Navigate down to the vCenter Server details, or select vCenter Server in the tabs across the top of the edit screen. Add the credentials and click “Connect“.

This will automatically discover the data centers on that vCenter server. Select the appropriate DC (if there is only one, it is automatically selected), and then select the appropriate content library for the Service Engine images. In this example, the content library is called avi.
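For reference, the same change can be made through the Avi REST API by updating the cloud object (PUT /api/cloud/&lt;uuid&gt;). The sketch below builds a hypothetical request body; the field names follow the Avi object model but should be verified against the API guide for your controller version, and all values shown are placeholders.

```python
def build_vcenter_cloud_config(vcenter_url, username, password, datacenter, content_lib_id):
    """Sketch of a Default-Cloud body for PUT /api/cloud/<uuid>.
    Field names are assumptions; check the Avi API guide for your version."""
    return {
        "name": "Default-Cloud",
        "vtype": "CLOUD_VCENTER",  # switches the cloud from "No Orchestrator" to vCenter
        "vcenter_configuration": {
            "vcenter_url": vcenter_url,
            "username": username,
            "password": password,
            "datacenter": datacenter,
            "privilege": "WRITE_ACCESS",  # NSX ALB deploys SEs, so write access is needed
            "content_lib": {"id": content_lib_id},  # Content Library for SE images
        },
    }

# Placeholder values for illustration only:
cloud_body = build_vcenter_cloud_config(
    "vcsa.example.local", "administrator@vsphere.local", "secret", "Datacenter", "lib-uuid"
)
```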

Note the information in blue – the configuration needs a “Save and Relaunch” to complete the setup. Click the “Save and Relaunch” button and follow the remaining setup steps.

This should trigger the syncing of Service Engine images to the content library.

2.3 Default-Cloud Setup – Networking

Two steps now remain. The first is to select the Management Network, and the second is to create an IPAM Profile. The management network provides communication between the vSphere infrastructure and the NSX ALB controller, and it is also used by the Service Engines. It can be configured to use either DHCP or a static IP address range. I have chosen DHCP.

If you do not want to use DHCP and would prefer a static range of IP addresses, follow the steps outlined in section 2.4, which shows how to set up a static range for the Load Balancer network. Simply do the same task for the management network.

The final Default-Cloud setup step is the creation of an IPAM (IP Address Management) Profile. Basically, when the AKO on the Tanzu clusters requests a VIP / Load Balancer IP address, IPAM will manage and track the provisioning of the IP addresses. Click on the 3 vertical dots to the right to create a new profile.

Provide a name for the IPAM Profile. The type is Avi Vantage IPAM, the Cloud is the Default-Cloud and the network is the Load Balancer network. Click on the “Add” button under Usable Networks to choose from the list of available/discovered networks. When the information is populated, click on “Save” to save the IPAM Profile.
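The same profile can also be created through the Avi REST API (POST /api/ipamdnsproviderprofile). The sketch below builds a hypothetical request body; the field names follow the Avi object model but should be checked against your controller version, and the network reference is a placeholder.

```python
def build_ipam_profile(name, usable_network_refs):
    """Sketch of an internal (Avi Vantage) IPAM profile body for
    POST /api/ipamdnsproviderprofile. Field names are assumptions."""
    return {
        "name": name,
        "type": "IPAMDNS_TYPE_INTERNAL",
        "internal_profile": {
            # one entry per usable network the profile may allocate VIPs from
            "usable_networks": [{"nw_ref": ref} for ref in usable_network_refs],
        },
    }

# Placeholder network reference for illustration only:
profile = build_ipam_profile("tanzu-vip-ipam", ["/api/network/vip-network"])
```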

The Default-Cloud configuration is now complete. Click on “Save” to save the settings made to the cloud.

2.4 Load Balancer / VIP Network Configuration

In the previous step, we told the IPAM in the default cloud which network we would use to provision IP addresses for the Load Balancer Service. Now we need to configure this network. The considerations here are whether the VIPs will be provided via DHCP or from a static IP range, and whether these IP addresses will be used for both the Service Engines and the VIPs (the Service Engines are also connected to the VIP network). Under Infrastructure > Cloud Resource, select Networks. To edit the network configuration, click on the pencil icon next to the network that was configured with the IPAM Profile to provide Load Balancer IP addresses.

I am not going to use DHCP for this network. Instead I will add a static range. Click on the “+ Add Subnet” button to proceed.

By default, the Service Engines and VIPs are on the same network, but this does not need to be the case; separate networks (or network segments) could be set up for each. In this case, the configuration is left at the default. In the simplified version below, I am choosing 16 IP addresses from the subnet range of 64 IP addresses. These IP addresses will be used both for VIP requests from Tanzu Kubernetes and for the SEs. Since I am not planning anything large scale in my lab, this should meet my purposes. You may want to choose a much larger range, especially for production deployments.
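As a sanity check on the numbers, Python's ipaddress module makes it easy to carve a 16-address static pool out of a 64-address subnet. The subnet below is a placeholder, not the actual network used in this lab.

```python
import ipaddress

# A /26 contains 64 IP addresses (62 usable hosts); placeholder subnet.
vip_subnet = ipaddress.ip_network("192.168.62.0/26")
hosts = list(vip_subnet.hosts())  # excludes network and broadcast addresses

# Reserve the top 16 usable addresses for SEs and VIPs.
static_pool = hosts[-16:]
print(f"Static range: {static_pool[0]} - {static_pool[-1]}")
# → Static range: 192.168.62.47 - 192.168.62.62
```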

There is one more edit to make. Since my workload network is on a separate network to my VIP network, I need to add a static route so that the Service Engines know how to route requests between the workload network and the VIP network. The Service Engines are connected to the management network and the VIP network, but are not on the workload network. I will qualify this last statement by saying that it is mostly true, unless the VIP and workload addresses are from different segments on the same network, which could be the case. Remember that clients will communicate with the VIP address, but those requests need to be sent on to the endpoints on the workload network. To set up the static route, navigate to Infrastructure > Cloud Resource and select VRF Context (VRF is short for Virtual Routing and Forwarding). Click on the pencil icon to edit the global VRF context.

The next step is to add the static route between the workload network and the VIP network. Set the Gateway subnet to be the workload network and the next hop to be the gateway on the LB VIP network. Once added, click Save.
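The same static route can be expressed as an entry in the global VRF context through the Avi REST API (PUT /api/vrfcontext/&lt;uuid&gt;). The sketch below builds a hypothetical entry; field names follow the Avi object model but should be verified, and the addresses are placeholders.

```python
def build_static_route(route_id, workload_cidr, next_hop_ip):
    """Sketch of one static_routes entry for the global VRF context.
    Field names are assumptions based on the Avi object model."""
    prefix, mask = workload_cidr.split("/")
    return {
        "route_id": route_id,
        # destination: the workload network the SEs need to reach
        "prefix": {"ip_addr": {"addr": prefix, "type": "V4"}, "mask": int(mask)},
        # next hop: the gateway on the LB/VIP network
        "next_hop": {"addr": next_hop_ip, "type": "V4"},
    }

# Placeholder workload network and VIP-network gateway:
route = build_static_route("1", "192.168.60.0/24", "192.168.62.1")
```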

After saving the static route, check the status of the Default-Cloud. It should show a status of green.

We can now proceed with the remaining setup step, namely creating an SSL certificate so that the NSX ALB will trust the requests for VIPs from a Tanzu Kubernetes cluster.

2.5 Modify Access Settings / Create new certificate

To create a new self-signed certificate, select Administration > Settings and then select Access Settings. Click on the pencil icon to edit the settings.

This should now be familiar territory for those of you who have set up the NSX ALB in the past. The usual steps apply, whereby the existing SSL/TLS certificates are deleted and a new self-signed cert is created for the NSX ALB controller.

These are the default certs (system-Default-Portal-Cert-*) in the SSL/TLS Certificate section.

Click on the X in the certificate name to delete them from the list.

Now click on the down-arrow (chevron) in the SSL/TLS Certificate and select the option to create a new certificate.

In the new certificate, provide the FQDN of the NSX ALB controller as the Common Name. In the lower part of the UI, under Subject Alternate Name (SAN), click on ADD. The remaining fields can be ignored or left at the default.

The Subject Alternate Name is the IP address of the NSX ALB controller. After adding this, click on the Save button. Note: after deleting the older certs and creating the new self-signed certificate, you may need to refresh your browser and log in to the admin portal once again.
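If you prefer to prepare an equivalent certificate outside the UI (for example, to import it rather than generate it in place), openssl can produce a self-signed cert with the same CN/SAN layout. The FQDN and IP address below are placeholders for your own controller.

```shell
# Self-signed cert for the controller: CN is the FQDN, SAN carries the IP address.
# nsxalb.example.local / 192.168.1.50 are placeholders for your own controller.
openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
  -keyout nsxalb.key -out nsxalb.crt \
  -subj "/CN=nsxalb.example.local" \
  -addext "subjectAltName=DNS:nsxalb.example.local,IP:192.168.1.50"
```

Note that the `-addext` option requires OpenSSL 1.1.1 or later.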

The certificate can then be retrieved from Templates > Security. The newly created SSL certificate should be visible in the list of certificates. Click on the download icon at the right-hand side to open a download window. This will be required when deploying Tanzu Kubernetes with the NSX ALB acting as a Load Balancer provider.

A final setting in the Access Controls is whether or not you wish to allow basic authentication. The choice is yours, but I’ve been told in the past that if security is a major concern, it is best not to enable this option. I’m showing it here for completeness.

2.6 Licensing

Last but not least we come to licensing. Don’t forget this, or your Service Engines will never deploy even though everything else appears to be working. This will mean that even though your NSX ALB has allocated VIPs as requested by Tanzu, you won’t be able to route to these IP addresses since the Service Engines provide this routing. Thus, you won’t be able to communicate with the Kubernetes API servers on these clusters.

To access the licensing section on the NSX ALB, navigate to Administration > Licensing. Click on the ‘cog’ icon to select the appropriate license tier and add the key. In this example, I have chosen Enterprise Tier. Click Save to save the license tier selection.

You will next be prompted to add the license key. After adding it, either by pasting in the key or uploading a license file, click on “Apply Key”. The NSX ALB is now licensed.

That now completes the setup of the NSX ALB v22.1.1. As you can see, it is quite a bit different to previous versions of the NSX ALB, but hopefully there is enough information here to allow you to configure this newer version.

9 Replies to “NSX ALB v22.1.1 – New Setup Steps”

  1. That is just an awesome article, thank you.
    I am wondering, since I am just starting my path with K8s etc., why ALB vs NSX deployment… I know I get vSphere Pods; besides that I see many advantages for ALB…

    1. Yes – I believe vSphere Pods is the most significant difference, but remember that in the current versions of vSphere with Tanzu, vSphere Pods form the basis of the embedded Harbor Registry service, the Velero backup service and other 3rd party services such as Minio. If you were interested in these features today, you would also need NSX-T.

      If these were not a requirement, and you wanted to deploy them manually in workload/guest cluster, then I agree that the NSX ALB is a great solution.

      1. You forgot that with NSX-T you have egress features, and you don’t have them without NSX-T (at least for TKGS, but you can enable them on TKGm because the version of Antrea is higher).
        Another big difference is that with TKGm and NSX ALB you can have multiple frontend networks (for a lot of customers with zero trust it is kind of mandatory that, for example, test can’t communicate with prod, and for that they prefer to have multiple clusters, each one on a dedicated VLAN and segregated by a firewall)

  2. Thanks for the great article!
    But one question: Why should I use something like NSX Advanced Loadbalancer (which is quite complex) and not simply use MetalLB inside my K8s cluster?

    1. The first reason is that MetalLB cannot be used for the control plane nodes (to the best of my knowledge). I suppose you would need some other mechanism like kube-vip to use for the API server.
      Another reason would be how the load balancer mechanism is implemented. IIRC, MetalLB chooses one of the nodes and doesn’t do any sort of round-robin load balancing per se, but instead behaves as a sort of highly-available/failover mechanism. So one node could end up being the bottleneck for your app traffic. (NB – this behaviour may have changed, so worth investigating further in case it has.)

  3. Well-written article.

    In my deployment, I have the management network, FrontEnd network (SE &VIPs) and Workload network. The Supervisor cluster was successfully created in vCenter, but when I checked on AVI, the Virtual Services were not healthy and were showing as connected to the front end network rather than the workload network. Do you have any clue?

    1. That’s ok Josh. The SEs are connected to 2 networks – they get a connection on the front-end/load balancer network and on the management network.

      1. As mentioned, the SEs seem to be fine. I can see the Supervisor Cluster VMs’ IP addresses (Workload Network) when I look at the tree view of the 2 virtual services, domain-c8-kube-sys and domain-c8-vmware-sys, however the portgroup is listed as the Frontend portgroup.

        1. That is a bit strange. Not sure why that would be the case. The SV nodes should certainly be on the workload network to connect to the pool when you use the tree view.
          And then the virtual service IP addresses should of course be taken from the front-end/LB range. I’m not sure what would cause this behaviour if the SV nodes are showing up as using both the management network and the workload network in vSphere, and then are shown as using front-end in the NSX ALB UI.
