Right before the holiday, I had spent some time exploring the Tanzu Application Platform (TAP), which also recently GA'ed. TAP provides developers with an application-aware platform that focuses on simplifying the developer experience of developing, building and running applications on Kubernetes.
If you are interested in a quick technical deep dive into TAP, check out this video by Scott Sisil, introducing TAP:
One of the core components of TAP is the Cloud Native Runtime (CNR), which is VMware's commercial offering of the popular open source project Knative. The VMware Event Broker Appliance (VEBA) project also makes use of Knative as our backend to provide customers with an event-driven automation solution.
Early on in the VEBA project, we knew that we wanted to develop and innovate with the community in the open, but we also understood there would be users who would want an officially supported offering that they can call or file support requests against when needed. Early last year, Michael Gasch, the lead architect for VEBA, started to port the code from the VMware Event Router, which is the heart of VEBA, into CNR's Tanzu Sources for vSphere and to unify the two code bases. The goal is to ensure that users of the open source VEBA project will also have a consistent user experience in terms of function deployment when using the commercial offering.
As shared back in December, I was able to successfully deploy TAP, CNR and Sources for vSphere, all running in Tanzu Community Edition (TCE), a completely free, enterprise-grade Kubernetes distribution available to anyone in the community. For those interested, you can find the instructions below on how to deploy and configure TAP to enable vSphere event-driven automation capabilities for your infrastructure. If you are interested in deploying this using the Tanzu Kubernetes Grid (TKG) Service, check out this other recent blog post that outlines the specific steps.
✅ Tanzu Community Edition (TCE) on #VMWonAWS
✅ Tanzu Application Platform
✅ Cloud Native Runtime
✅ Sources for vSphere
✅ VMC vCenter Events via Sockeye
✅ PowerShell function to notify via Slack when VM Powered Off (existing #VEBA function)

Will blog details post-holiday!

— William Lam (@lamw) December 14, 2021
TAP and CNR Installation
Step 1 - Download the latest TCE version of the Tanzu CLI (tce-linux-amd64-v0.9.1.tar.gz) from GitHub and the latest Tanzu Cluster Essentials (tanzu-cluster-essentials-linux-amd64-1.0.0.tgz) from the Tanzu Network, and transfer the files to a computer that has access to your TCE Workload Cluster.
Step 2 - Clone the following GitHub repository, which contains all of the sample YAML files that will be referenced.
git clone https://github.com/lamw/vsphere-event-driven-automation-tap.git
Step 3 - Extract the contents of the two archives into tanzu and tanzu-cluster-essentials directories.
mkdir tanzu && mkdir tanzu-cluster-essentials
tar -zxvf tce-linux-amd64-v0.9.1.tar.gz -C tanzu
tar -zxvf tanzu-cluster-essentials-linux-amd64-1.0.0.tgz -C tanzu-cluster-essentials
Step 4 - Install the TCE version of Tanzu CLI.
cd tanzu/tce-linux-amd64-v0.9.1
./install.sh
Step 5 - Deploy a TCE Management Cluster if you have not already. A minimum of 1 x Control Plane and 1 x Worker Node is required with at least 2 vCPU and 8GB of memory. You can refer to the following tce-mgmt-cluster-sample.yaml example and modify based on your environment.
Note: In the example below, I am deploying the TCE Management Cluster to a VMware Cloud on AWS environment, but this can also be used with a traditional on-premises vSphere environment.
tanzu mc create -f tce-mgmt-cluster-sample.yaml
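Both sample files follow the standard TKG-style cluster configuration variable format. As a rough illustration, the management cluster file might contain entries like the excerpt below; every value shown (vCenter address, credentials, inventory paths, endpoint IP) is a placeholder you would replace for your environment:

CLUSTER_NAME: tce-mgmt
CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: vsphere
VSPHERE_SERVER: vcenter.sddc.example.com
VSPHERE_USERNAME: administrator@vsphere.local
VSPHERE_PASSWORD: 'FILL-ME-IN'
VSPHERE_DATACENTER: /SDDC-Datacenter
VSPHERE_DATASTORE: /SDDC-Datacenter/datastore/WorkloadDatastore
VSPHERE_NETWORK: /SDDC-Datacenter/network/sddc-cgw-network-1
VSPHERE_RESOURCE_POOL: /SDDC-Datacenter/host/Cluster-1/Resources/Compute-ResourcePool
VSPHERE_FOLDER: /SDDC-Datacenter/vm
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa AAAA...
VSPHERE_CONTROL_PLANE_ENDPOINT: 192.168.1.100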
Step 6 - Deploy a TCE Workload Cluster if you have not already. A minimum of 3 x Control Plane and 3 x Worker Node is recommended with at least 4 vCPU and 8GB of memory. You can refer to the following tce-workload-cluster-sample.yaml example and modify based on your environment.
Note: In the example below, I am deploying the TCE Workload Cluster to a VMware Cloud on AWS environment, but this can also be used with a traditional on-premises vSphere environment.
tanzu cluster create -f tce-workload-cluster-sample.yaml
Step 7 - Next, we need to pause and delete the existing kapp-controller in the TCE Management Cluster, or else it will revert our changes when installing TAP in the TCE Workload Cluster. The example below assumes the TCE Workload Cluster is named tce-wl-01; if yours is named differently, modify accordingly.
kubectl config use-context tce-mgmt-admin@tce-mgmt
kubectl patch app/tce-wl-01-kapp-controller -n default -p '{"spec":{"paused":true}}' --type=merge
Step 8 - Change into the TCE Workload Cluster context and then delete the existing kapp-controller deployment and deploy the latest kapp-controller (v0.29.0 as of writing this blog post).
tanzu cluster kubeconfig get tce-wl-01 --admin
kubectl config use-context tce-wl-01-admin@tce-wl-01
kubectl delete deployment kapp-controller -n tkg-system
kubectl apply -f https://github.com/vmware-tanzu/carvel-kapp-controller/releases/download/v0.29.0/release.yml
Step 9 - Disable the default Pod Security Policy (PSP).
kubectl apply -f disable_psp.yaml
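The disable_psp.yaml file is included in the cloned repo. As a rough idea of what it does, a common approach on TKG-based clusters is a ClusterRoleBinding that grants the built-in privileged PSP to all authenticated users; a minimal sketch, assuming the cluster ships with the vmware-system-privileged PodSecurityPolicy (the repo's file may differ):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: all-users-privileged-psp
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  # permissive PSP ClusterRole shipped with TKG-based clusters
  name: psp:vmware-system-privileged
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated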
Step 10 - Install the Carvel secretgen-controller, which is required for TAP.
kubectl create ns secretgen-controller
kubectl apply -f https://github.com/vmware-tanzu/carvel-secretgen-controller/releases/latest/download/release.yml
Step 11 - Set up the required environment variables that will be used to install TAP, which will also be referenced later during the deployment. You will need to replace the values of both INSTALL_REGISTRY_USERNAME and INSTALL_REGISTRY_PASSWORD with the username and password you use to log into the Tanzu Network.
export INSTALL_BUNDLE=registry.tanzu.vmware.com/tanzu-cluster-essentials/cluster-essentials-bundle@sha256:82dfaf70656b54dcba0d4def85ccae1578ff27054e7533d08320244af7fb0343
export INSTALL_REGISTRY_HOSTNAME=registry.tanzu.vmware.com
export INSTALL_REGISTRY_USERNAME='TANZU-NET-USER'
export INSTALL_REGISTRY_PASSWORD='TANZU-NET-PASSWORD'
Step 12 - Create a TAP registry secret that will be used when installing packages from TAP and then add the current TAP 1.0 repository.
kubectl create ns tap-install

tanzu secret registry add tap-registry \
  --username ${INSTALL_REGISTRY_USERNAME} --password ${INSTALL_REGISTRY_PASSWORD} \
  --server ${INSTALL_REGISTRY_HOSTNAME} \
  --export-to-all-namespaces --yes --namespace tap-install

tanzu package repository add tanzu-tap-repository \
  --url registry.tanzu.vmware.com/tanzu-application-platform/tap-packages:1.0.0 \
  --namespace tap-install
To confirm that everything was configured correctly, run the following command and verify that the status shows "Reconcile succeeded".
tanzu package repository get tanzu-tap-repository --namespace tap-install
Step 13 - Install the CNR 1.1.0 package using the basic values file (cnr-values.yaml).
tanzu package install cloud-native-runtimes -p cnrs.tanzu.vmware.com -v 1.1.0 -n tap-install -f cnr-values.yaml --poll-timeout 40m
Note: Depending on the available resources, it is possible that the timeout value will be reached before all CNR packages are installed and running. If this happens, you can retry the operation.
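Before retrying, you can check on the status of the package install with the tanzu CLI; for example:

tanzu package installed get cloud-native-runtimes -n tap-install
tanzu package installed list -n tap-install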
Step 14 - Install the remaining dependencies (RabbitMQ Cluster Operator, cert-manager & Messaging Topology Operator) for CNR.
kubectl apply -f https://github.com/rabbitmq/cluster-operator/releases/download/v1.6.0/cluster-operator.yml
kubectl wait pod --timeout=3m --for=condition=Ready -l '!job-name' -n rabbitmq-system

kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.2.0/cert-manager.yaml
kubectl wait pod --timeout=3m --for=condition=Ready -l '!job-name' -n cert-manager

kubectl apply -f https://github.com/rabbitmq/messaging-topology-operator/releases/download/v0.8.0/messaging-topology-operator-with-certmanager.yaml
kubectl wait pod --timeout=3m --for=condition=Ready -l '!job-name' -n rabbitmq-system
Step 15 - Create the vmware-functions namespace, which will be used to deploy the vSphere Sources and your functions.
kubectl create ns vmware-functions
Step 16 - Create a TAP registry secret for the vmware-functions namespace and update the default service account to reference the credential, which will be used when the CNR components are instantiated.
kubectl -n vmware-functions create secret docker-registry registry-credentials \
  --docker-server "${INSTALL_REGISTRY_HOSTNAME}" \
  --docker-username "${INSTALL_REGISTRY_USERNAME}" \
  --docker-password "${INSTALL_REGISTRY_PASSWORD}"

kubectl patch serviceaccount -n vmware-functions default -p '{"imagePullSecrets": [{"name": "registry-credentials"}]}'
Step 17 - Deploy RabbitMQ and the Broker.
kubectl apply -f rabbit.yaml
kubectl wait pod --timeout=3m --for=condition=Ready -l '!job-name' -n vmware-functions
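You can also confirm the Broker itself reports Ready (assuming rabbit.yaml creates a Knative Eventing Broker in the vmware-functions namespace):

kubectl -n vmware-functions get broker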
Step 18 (Optional) - Deploy Sockeye, which provides a graphical interface for easily viewing vSphere events in a browser.
kubectl apply -f sockeye.yaml
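Once deployed, you can retrieve the Sockeye URL from its Knative Service (assuming sockeye.yaml deploys Sockeye as a Knative Service named sockeye):

kubectl -n vmware-functions get ksvc sockeye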
Step 19 - Update the vsphere-secret.yaml example with the credentials of the vCenter Server you intend to use and then run the following command to create the vSphere secret.
kubectl -n vmware-functions apply -f vsphere-secret.yaml
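As a rough sketch, the secret is a standard basic-auth style Kubernetes Secret carrying username and password keys; the secret name below (vsphere-credentials) is an assumption and should match whatever the repo's vsphere-source.yaml references:

apiVersion: v1
kind: Secret
metadata:
  name: vsphere-credentials
  namespace: vmware-functions
type: kubernetes.io/basic-auth
stringData:
  # read-only vCenter service account is sufficient for this solution
  username: svc-veba-ro@vsphere.local
  password: 'FILL-ME-IN'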
Note: A big benefit of having the vSphere secret created separately is that you can re-use common credentials, such as a shared read-only service account, which is the minimum vSphere role required for this solution.
Step 20 - Update the vsphere-source.yaml example with the vCenter Server FQDN or IP Address that you intend to use and then run the following command to create the vSphere Source instance.
kubectl -n vmware-functions apply -f vsphere-source.yaml
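For illustration, a minimal VSphereSource that streams vCenter events to the Broker might resemble the sketch below; the source name (vsphere-01, matching the service account and deployment names in the next step), the secret name and the Broker name (default) are assumptions based on the repo examples:

apiVersion: sources.tanzu.vmware.com/v1alpha1
kind: VSphereSource
metadata:
  name: vsphere-01
  namespace: vmware-functions
spec:
  # vCenter Server endpoint to stream events from
  address: https://vcenter.example.com
  skipTLSVerify: true
  # references the vSphere secret created in the previous step
  secretRef:
    name: vsphere-credentials
  # deliver events to the RabbitMQ-backed Broker
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default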
Step 21 - Patch the vSphere Source service account (e.g. vsphere-01-serviceaccount) to reference the TAP registry credentials and re-deploy the vSphere Source. Unfortunately, this cannot be done earlier, as the service account does not exist until the vSphere Source has been created in the previous step, and as expected, the initial deployment will fail because it does not have the correct TAP credentials until this step is performed.
kubectl patch serviceaccount -n vmware-functions vsphere-01-serviceaccount -p '{"imagePullSecrets": [{"name": "registry-credentials"}]}'
kubectl -n vmware-functions delete deployment/vsphere-01-deployment
At this point, you have successfully deployed and configured TAP, CNR and the vSphere Sources. You can now run the following command to verify that all deployments are running, as shown in the screenshot below.
kubectl get deployments -A
Note: TCE does not include an out-of-the-box Kubernetes service load balancer. If you wish to access Sockeye or the HTTP endpoints of your functions, you will need to deploy a service load balancer. If you have access to NSX Advanced Load Balancer (NSX-ALB), you can use that, or if you are looking for a free alternative, you can consider either kube-vip or MetalLB.
Function Deployment
With your vSphere Sources configured, you can now start writing and deploying functions that react to specific vSphere events. If you are familiar with VEBA functions, you can create similar functions using any of the supported languages, including PowerShell/PowerCLI, Python and Go. The main difference between a function running against the vSphere Sources and one running in VEBA is that instead of using the subject field to annotate the specific vSphere event you wish to react to, you must now use the type field and follow a new format for specifying the vSphere event. For example, instead of VmPoweredOffEvent it would be com.vmware.vsphere.VmPoweredOffEvent.v0, as shown in the sketch below.
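To make the type filter concrete, here is a minimal sketch of a Knative Trigger that routes these CloudEvents to a function. The trigger name matches the one verified later in this post (tap-ps-slack-trigger), while the Broker name (default) and the subscriber Knative Service name (tap-ps-slack) are assumptions based on the repo's function.yaml:

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: tap-ps-slack-trigger
  namespace: vmware-functions
spec:
  broker: default
  filter:
    attributes:
      # new type-based format replacing VEBA's subject field
      type: com.vmware.vsphere.VmPoweredOffEvent.v0
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: tap-ps-slack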
To demonstrate a simple function deployment, we will send a Slack notification whenever a VM is powered off. You will need access to Slack and the webhook URL for the desired Slack channel to send the notification to. For obtaining the Slack webhook URL, please refer to the official Slack documentation.
Step 1 - Update the slack-secret.json file with your Slack webhook URL and then run the following commands to create the Slack secret and deploy our Slack function, which will react to VMs being powered off in your vCenter Server.
kubectl -n vmware-functions create secret generic slack-secret --from-file=SLACK_SECRET=slack-secret.json
kubectl -n vmware-functions apply -f function.yaml
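For reference, slack-secret.json is a small JSON document carrying the webhook URL, following the format used by the existing VEBA PowerShell Slack function; the key name below is an assumption based on those examples:

{
  "SLACK_WEBHOOK_URL": "https://hooks.slack.com/services/XXX/YYY/ZZZ"
}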
Step 2 - Run the following command to verify the function was deployed correctly by looking at the READY column for tap-ps-slack-trigger and ensuring it has a value of True.
kubectl get trigger -A
Step 3 - Finally, log in to the vSphere UI of your vCenter Server and power off a VM; you should see a Slack notification similar to the example below.
PK says
Thanks for the great write-up as always. In your Git repo, it appears the CLUSTER_NAME and CLUSTER_PLAN values are reversed in the config files tce-mgmt-cluster-sample.yaml and tce-workload-cluster-sample.yaml.
William Lam says
Thanks for the catch PK! I've just fixed it.