I have been working with Tanzu Kubernetes Grid (TKG) quite a bit lately and using its slick new TKG CLI to deploy standalone Tanzu Kubernetes Clusters (TKC), which can run in both VMware Cloud on AWS and your on-premises vSphere 6.7 Update 3 environment. If you have vSphere 7 with the vSphere with Kubernetes capability, TKG deployments are supported natively as part of that solution, but you can also use the TKG CLI to deploy TKCs.
Out of the box, TKG includes all the necessary software components to deploy a production-grade, upstream and conformant Kubernetes distribution. For most customers, the "batteries included" type of offering is more than sufficient, but some customers may wish to customize some of these components further when running the standalone distribution. One such example is swapping out the default Container Network Interface (CNI), which is Calico, for a different CNI with more capabilities.
As you may have guessed from the title of this post, we will be replacing Calico with Antrea, which is another open source CNI. In fact, Antrea was started by VMware last year and uses Open vSwitch (OVS) to provide network and security capabilities to Kubernetes. You can read more about Project Antrea here and more details about its architecture can be found here.
Disclaimer: This is currently not officially supported by VMware. I do know the TKG team is looking at Antrea support in the future.
Today, the process of changing out the CNI in TKG is a manual operation, but in the future this could certainly be made simpler. When you install TKG CLI, there are two default plans, dev and prod, which are encoded in two YAML files under ~/.tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template*.yaml
The section at the very bottom of each file contains the Calico CNI definition, embedded under the calicoYaml key. If we want to deploy another CNI such as Antrea, we need to replace everything after the final "---" line shown below with our CNI deployment specification.
---
apiVersion: v1
kind: Secret
metadata:
  name: ${CLUSTER_NAME}-postcreate
  namespace: ${NAMESPACE}
stringData:
  calicoYaml: |
    ---
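If you want to see where this section starts in your own copy of a template before changing anything, a quick sketch is to search for the calicoYaml key. The function name find_cni_marker is just a hypothetical helper I made up for illustration; it is not part of the TKG CLI.

```shell
# Hypothetical helper (not part of TKG): print the line number and text of
# the line where the embedded Calico manifest begins in a plan template.
find_cni_marker() {
  grep -n 'calicoYaml' "$1"
}
```

For example, `find_cni_marker ~/.tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-dev.yaml` should print a single line; everything below it is the CNI definition.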
This is certainly not something you want to be doing by hand, and this is where some automation will help. Below are the commands that will automatically create two new plans called dev-antrea and prod-antrea, which are clones of the existing plans with the Antrea YAML appended in place of Calico, so that you will be able to use the TKG CLI to deploy a TKC running Antrea as the default CNI.
Step 1 - Run the following two commands from your home directory (the paths are relative to it). They create two new YAML files and automatically delete everything after the "---" as mentioned earlier.
sed '/calico-config/,$d' .tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-dev.yaml > .tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-dev-antrea.yaml
sed '/calico-config/,$d' .tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-prod.yaml > .tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-prod-antrea.yaml
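Since a silent mismatch here would leave you with a broken plan, it can be worth checking the result. The sketch below wraps the same sed expression in a hypothetical helper, strip_calico, and fails if the calicoYaml key is missing from the output. It assumes, as the commands above do, that the first line mentioning calico-config comes right after the "---" line.

```shell
# Hypothetical helper: clone a plan template, dropping everything from the
# first line that mentions "calico-config" through the end of the file.
strip_calico() {
  src=$1
  dst=$2
  sed '/calico-config/,$d' "$src" > "$dst"
  # Return non-zero if the calicoYaml key did not survive the truncation,
  # which would mean the marker matched earlier than expected.
  grep -q 'calicoYaml: |' "$dst"
}
```

After running it, `tail -n 2` on the new file should show the calicoYaml key followed by the "---" line.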
Step 2 - Run the following command to download the latest Antrea deployment specification. If the system does not have internet access, you can manually download this file.
wget https://raw.githubusercontent.com/vmware-tanzu/antrea/master/build/yamls/antrea.yml
Step 3 - Run the following commands, which take the contents of the Antrea deployment specification, indent each line and append the result to each of our new deployment plans.
awk '{print " ", $0}' antrea.yml >> .tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-dev-antrea.yaml
awk '{print " ", $0}' antrea.yml >> .tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-prod-antrea.yaml
rm antrea.yml
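The indentation matters in this step: the appended manifest has to sit inside the YAML block scalar under calicoYaml, so every line must be indented deeper than that key. Here is a sketch of the same append with the width made explicit. The helper name append_indented and the default width of 4 are my own assumptions, so compare against the indentation of the "---" line in your template before relying on it.

```shell
# Hypothetical helper: append a manifest to a plan template, indenting every
# line by a fixed number of spaces so it stays inside the YAML block scalar.
append_indented() {
  manifest=$1
  template=$2
  width=${3:-4}   # assumed default; match it to the indentation in your template
  awk -v n="$width" '
    BEGIN { pad = sprintf("%" n "s", "") }   # build a string of n spaces
    { print pad $0 }                         # prefix each manifest line with it
  ' "$manifest" >> "$template"
}
```

For example, `append_indented antrea.yml .tkg/providers/infrastructure-vsphere/v0.6.3/cluster-template-dev-antrea.yaml` would append the downloaded manifest with a four-space indent.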
We now have our two new plans and we can deploy a new TKC using TKG CLI with the following command:
tkg create cluster --plan=dev-antrea tkg-cluster-02
Once our new TKC is ready, we can list all the pods and we should now see Antrea as the default CNI:
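If you would rather check this from the command line, a small filter over the pod listing does the job. This is only a sketch: antrea_pods is a hypothetical helper, and it assumes the pods are named antrea-agent-* and antrea-controller-*, as in the upstream Antrea manifest.

```shell
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# only the Antrea pods. Exits non-zero if no Antrea pods are found.
antrea_pods() {
  grep -E '(^|[[:space:]])antrea-(agent|controller)-'
}
```

With your kubectl context pointing at the new cluster, run `kubectl get pods -A | antrea_pods`. No output (and a non-zero exit code) would suggest the CNI was not deployed.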
After getting familiar with Antrea, I thought this would also be a good time to revisit an enhancement that we have been wanting to make to our vCenter Event Broker Appliance (VEBA) Fling, which currently uses Weave for its CNI. As you can see from the screenshot below, I have already been able to get this working with VEBA. I am now just waiting for the official Antrea v0.6.0 release before we incorporate it into VEBA.
Vincent Chan says
Please provide above yaml file in Github for making the study easier! Thanks!