One of the exciting new features of the Tanzu Kubernetes Grid (TKG) 1.3 release is support for NSX Advanced Load Balancer (NSX ALB) as a Layer-4 load balancer solution for your Kubernetes (K8s) based workloads. Recently, a couple of customers asked whether TKG 1.3 with NSX ALB is supported on VMware Cloud on AWS (VMConAWS), and the answer is yes!
I suspect part of the reason this question came up is that it may have been difficult to find a clear support stance for this configuration, and although there is some documentation in the AVI Portal for installing NSX ALB on VMConAWS, it certainly was not easy to find. I also personally found the instructions to be on the lighter side after reading through them a few times. Since I already had my TKG Demo Appliance Fling deployed in my VMConAWS SDDC, it was easy enough to un-deploy my existing TKG Management Cluster and set it up with NSX ALB. You can find the detailed instructions below, and although the setup of NSX ALB and TKG is similar to an on-premises vSphere deployment as recently documented by Cormac Hogan, there are still some subtle differences, especially if you are not placing both the TKG and NSX ALB systems on the same single network, which you may find in demos 🙂
Setup NSX ALB
Step 1 - Create NSX-T segments for TKG Management/Workload Cluster as well as for our NSX ALB deployment. For simplicity purposes, I am using the same network for all NSX ALB functionality, but you can certainly separate this out further.
- tkg-network - This network will be used to deploy our TKG Management and Workload Cluster and is defined as 192.168.2.0/24
- nsx-alb - This network will be used to deploy the NSX ALB Controller, Service Engine and Virtual IPs (VIPs) and is defined as 192.168.3.0/24
- NSX ALB supports both DHCP and Static reservations for its interfaces, so I have carved up the network as follows:
- 192.168.3.2-192.168.3.20 (Static)
- 192.168.3.21-192.168.3.50 (DHCP from NSX)
- 192.168.3.51-192.168.3.254 (NSX ALB IPAM)
Step 2 - Download the NSX ALB Controller (20.1.5) OVA for VMware and deploy it to your VMConAWS SDDC. Fill out the following parameters if you do not wish to use DHCP and power on the VM.
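If you prefer the command line over the vSphere UI, the Controller OVA can also be deployed with govc. This is just a sketch and not part of the official instructions; the filenames and VM name are placeholders, and it assumes your GOVC_URL/GOVC_USERNAME/GOVC_PASSWORD environment variables already point at your VMConAWS SDDC.
govc import.spec controller-20.1.5.ova > controller-options.json
# Edit controller-options.json: map the OVA network to the nsx-alb segment and,
# if you are not using DHCP, fill in the management IP, netmask and gateway properties
govc import.ova -options=controller-options.json -name=nsx-alb-controller controller-20.1.5.ova
govc vm.power -on nsx-alb-controller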
Step 3 - To configure NSX ALB, we will need to wait for the system to become ready and then access the NSX ALB UI using the IP Address instead of the FQDN, which will allow us to set the admin credentials
Step 4 - Upon completing the initial credential configuration, you will be taken to a welcome screen to configure a backup passphrase along with configuring your DNS and Email settings. You can leave all the defaults for the Multi-Tenant configuration and ensure the Setup Cloud After option is unchecked and click Save.
Step 5 - Navigate to Infrastructure->Networks and create a new network which NSX ALB will use for provisioning our services. In this example, I have chosen the name NSX-ALB-Segment, with a subnet of 192.168.3.0/24 and an IP Pool of 192.168.3.51-192.168.3.254
Step 6 - Navigate to Infrastructure->Clouds and edit the Default-Cloud and click on the drop down to create a new IPAM/DNS profile.
Provide a name for the new IPAM profile and then select the NSX ALB Network that we just created in the previous step.
Once the new IPAM profile has been selected, also make sure the Template Service Engine Group is using the Default-Group and then click Save to complete the configuration.
Step 7 - Navigate to Templates->Security and create a new Controller Certificate, which will be required when we set up TKG.
Fill out the parameters shown in the screenshot below. I have configured the Subject Alternative Name (SAN) to include both the FQDN and IP Address of NSX ALB.
After the Controller Certificate has been created, click on the little arrow icon to export the certificate. This is required when we set up TKG.
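If you would like to double-check that the exported certificate actually contains both SAN entries before handing it to TKG, openssl can decode it. The filename below is just an example of wherever you saved the export:
openssl x509 -in nsx-alb-controller.crt -noout -text | grep -A1 'Subject Alternative Name'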
Step 8 - We can now assign the certificate that we had just generated to our controller by navigating to Administration->Settings->Access Settings and updating the default certificate entry.
Step 9 - Navigate to Infrastructure->Clouds and click on the little arrow icon to generate the unique Service Engine OVA that will be associated with your NSX ALB deployment. This will take a few minutes and then you will be prompted to save the OVA. For your convenience, you should upload the OVA into a vSphere Content Library, which will allow you to easily deploy new Service Engines without having to re-upload from your desktop.
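As a sketch, if you already use govc, the Service Engine OVA can be pushed into a Content Library from the command line as well. The library name and OVA filename below are placeholders; WorkloadDatastore is the default workload datastore name in a VMConAWS SDDC:
# Create a Content Library backed by the SDDC workload datastore (skip if you already have one)
govc library.create -ds=WorkloadDatastore nsx-alb-library
# Upload the Service Engine OVA into the library
govc library.import nsx-alb-library se.ova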
Step 10 - Navigate to Infrastructure->Clouds and click on the key icon to retrieve both the Cluster UUID and Authentication Token, which are required inputs when deploying the Service Engine OVA.
Step 11 - Deploy the Service Engine OVA and during the network selection, make sure Data Network 2 is configured to the TKG network you provisioned in Step 1.
Step 12 - When prompted for the NSX ALB configuration, provide the IP Address of your controller along with the Cluster UUID and Authentication Token. If you do not wish to use DHCP for the Service Engine Management Interface, you can provide the additional networking as shown in the example below.
Step 13 - If the Service Engines were deployed successfully, you should see them listed under the Infrastructure->Service Engine tab. The last thing we need to do is configure the network interfaces on the Service Engines to let NSX ALB know which ones map to the two networks we provisioned in Step 1. This was something I had initially missed because I had assumed the interface labels mapped directly to the Service Engine VM, but you need to carefully match the VM MAC Addresses of Network adapter 2 and 3 on the Service Engine VM to the interfaces shown in the Service Engine UI. Make sure this step is configured correctly or you will have connectivity issues accessing the load balancer IPs.
Click on the pencil icon to edit the specific Service Engine and match the MAC Address from the Service Engine VM using the vSphere UI. Once you have identified the two interfaces, you can decide whether you wish to use DHCP and/or a Static IP Address. In this example, I am using DHCP for the nsx-alb network and a static IP for the tkg-network.
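If you find it handy, a quick way to pull the MAC address for each network adapter of a Service Engine VM (so you can match them up in the NSX ALB UI) is with govc. The VM name below is a placeholder for however your Service Engine VM is named in vCenter:
govc device.info -vm avi-se-1 | grep -E 'Label|MAC Address'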
At this point, you have completed all the necessary configuration for TKG to now start using NSX ALB
Setup TKG with NSX ALB
Step 1 - We need to base64 encode the NSX ALB Controller Certificate that we retrieved from Step 7 above. You can use the base64 utility if you are on a macOS or Linux system, or simply use an online service such as https://www.base64encode.org/, and save the results, which will be used in the next step.
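For example, assuming the exported certificate was saved as nsx-alb-controller.crt (the filename is just an example):
# macOS
base64 -i nsx-alb-controller.crt
# Linux (-w 0 keeps the encoded output on a single line)
base64 -w 0 nsx-alb-controller.crt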
Step 2 - We now need to add the following to our TKG Management Cluster deployment YAML manifest including the base64 encoded certificate and NSX ALB configuration.
AVI_ENABLE: "true"
AVI_CONTROLLER: "192.168.3.2"
AVI_USERNAME: "admin"
AVI_PASSWORD: "....."
AVI_CA_DATA: "LS0t......."
AVI_CLOUD_NAME: "Default-Cloud"
AVI_SERVICE_ENGINE_GROUP: "Default-Group"
AVI_DATA_NETWORK: "NSX-ALB-Segment"
AVI_DATA_NETWORK_CIDR: "192.168.3.0/24"
Note: I will assume the reader is already familiar with the process of deploying a standard TKG Management Cluster. If you are not, I recommend taking a look at the TKG Demo Appliance Fling which includes a step-by-step workshop guide for deploying TKG on VMware Cloud on AWS. Using the Fling, you can simply replace the AVI section within the sample vmc-tkg-mgmt-template.yaml file.
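For reference, once the AVI section has been added, the Management Cluster is deployed as you normally would with the Tanzu CLI that ships with TKG 1.3; something along these lines, using the Fling's sample filename:
tanzu management-cluster create --file vmc-tkg-mgmt-template.yaml -v 6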
Step 3 - Once the TKG Management Cluster has been set up, we can confirm that everything was configured correctly by verifying that the NSX ALB K8s operator is running, using the following command:
kubectl -n tkg-system-networking get pod
Request K8s Load Balancer Service
Step 1 - Create a new TKG Workload Cluster and ensure that the deployment YAML manifest also contains the same NSX ALB configuration from Step 2 of Setup TKG with NSX ALB
Note: I will assume the reader is already familiar with the process of deploying a standard TKG Workload Cluster. If you are not, I recommend taking a look at the TKG Demo Appliance Fling which includes a step-by-step workshop guide for deploying TKG on VMware Cloud on AWS. Using the Fling, you can simply replace the AVI section within the sample vmc-tkg-workload-1-template.yaml file.
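For reference, the Workload Cluster deployment and login also follow the standard Tanzu CLI workflow; the cluster name tkg-cluster-01 below is just a placeholder:
tanzu cluster create --file vmc-tkg-workload-1-template.yaml -v 6
# Retrieve the admin kubeconfig and switch to the new cluster's context
tanzu cluster kubeconfig get tkg-cluster-01 --admin
kubectl config use-context tkg-cluster-01-admin@tkg-cluster-01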
Step 2 - Deploy a K8s service of type LoadBalancer and observe the NSX ALB integration with TKG
Option A: If you are using the TKG Demo Appliance Fling, included within the appliance are several demos. You can navigate to /root/demo/yelb and then run the following to deploy the Yelb application:
kubectl create ns yelb
kubectl -n yelb apply -f yelb-lb.yaml
To retrieve the provisioned load balancer IP from NSX ALB, run the following command:
kubectl -n yelb get svc
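If you just want the external IP by itself (to plug into a script, for example), a jsonpath query works too. The service name yelb-ui is the front-end service in the Yelb demo; adjust it if your output shows a different name:
kubectl -n yelb get svc yelb-ui -o jsonpath='{.status.loadBalancer.ingress[0].ip}'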
You can now open a browser to the IP Address and the Yelb application should load, served by NSX ALB!
We can also view the provisioned service within the NSX ALB UI by navigating to Applications->Dashboard to get additional information such as health, logs, events, etc.
Note: If you are unable to access the IP Address that was provisioned by NSX ALB, verify that the Service Engine network mapping from Step 13 is correct. I initially ran into an issue here; you can verify connectivity by SSH'ing to the Service Engine VM and checking whether you can ping the TKG Control Plane or Worker Nodes. If this is not successful, then you either have an incorrect network mapping or you did not connect the Service Engine VM to the TKG Network.
Option B: Create an lb.yaml file that contains the following snippet:
apiVersion: v1
kind: Service
metadata:
  name: lb-svc
spec:
  selector:
    app: lb-svc
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lb-svc
spec:
  replicas: 2
  selector:
    matchLabels:
      app: lb-svc
  template:
    metadata:
      labels:
        app: lb-svc
    spec:
      serviceAccountName: default
      containers:
        - name: nginx
          image: gcr.io/kubernetes-development-244305/nginx:latest
and then deploy using the following command:
kubectl apply -f lb.yaml
To retrieve the provisioned load balancer IP from NSX ALB, run the following command:
kubectl get svc
You can now simply curl or open the IP Address in a web browser to confirm functionality.
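For a quick scripted check, you can capture the IP and curl it in one go. This assumes the Service name lb-svc from the snippet above:
LB_IP=$(kubectl get svc lb-svc -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -I "http://${LB_IP}"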
We can also view the provisioned service within the NSX ALB UI by navigating to Applications->Dashboard to get additional information such as health, logs, events, etc.