Exploration of Tanzu Kubernetes Grid (TKG) multi-vCenter Server templating using YTT

The motivation behind this blog post originates from a really cool blog post by Mike Brown, who shared an interesting Telco use case: running Tanzu Kubernetes Grid (TKG) on VMware Cloud on AWS (VMConAWS) and centrally managing the TKG Workload Clusters that run at each individual Edge/Cell Site location.

Awesome post from @vcdx71, lots of great nuggets! https://t.co/1tPFv1kpHf

1) 🔥@VMwareTanzu Kubernetes Grid (TKG) w/multi-vCenter Servers

2) 📈 Continue adoption of #VMWonAWS for DC Evac & extending to Edge Mgmt

3)📡 Cell Site/RAN mention,♥️ innovations from Telco customers

— William Lam (@lamw) July 13, 2021

While reading through Mike's blog post, I noticed one of the steps was to edit the YAML generated from the TKG Management Cluster, which is then used to deploy the individual TKG Workload Clusters. This would need to happen for each new deployment 😮 and of course, this could be very error-prone and frustrating for end users. Here is an example of what the YAML file looks like, which is well over 1,000 lines!

This screams for automation and I had been looking for a reason to try out YTT again, which is a YAML templating tool that is part of the open source project Carvel. Although I had played with YTT before, it did not feel intuitive, especially for a new user who was trying to solve a quick problem. I figured this was my opportunity to take another look at YTT.

After a couple of hours and a lot of trial/error, I ended up with a partial solution and realized that I would not be able to figure this out given there were even more complicated sections within the YAML. I felt the bar to getting started with YTT was still too high and it may not be the right tool for this particular situation. I opted for a quicker solution using sed, which I had experience with before, but I also know that depending on the problem, sed can be just as complex and I also dislike regular expressions 🙂

I shared my second experience with YTT on Twitter in a light-hearted tweet 😉

I tried (really) tried using YTT yesterday to substitute a handful of parameters …. made some progress, but just ran into more errors. Also found constraint where you can’t ref variable in same data values file

Gave up and went back to the tried and true solution of … sed 🤣

— William Lam (@lamw) July 11, 2021

I was surprised to hear from a number of folks from the YTT community who wanted to better understand the issues I was having and what could be done to improve the overall user experience. I had already been interacting with a few of the YTT Engineers in the Carvel Slack Channel, where I had asked a few questions while trying to figure out the solution.

Dmitriy Kalinin, one of the Engineers working on YTT, reached out on Slack and offered to help me if I was still interested in a YTT-based solution. If nothing else, he also wanted to understand where the gaps were and how YTT could be improved. I gladly took Dmitriy up on his offer since I was still interested in a YTT solution but, more importantly, I had also been thinking about a way to simplify the YAML management for our VMware Event Broker Appliance (VEBA) project and I was interested to see if YTT could help.

A meeting was set up with Dmitriy where we spent a little over 30 minutes stepping through each section of the original YAML that I wanted to transform. Dmitriy not only helped me expand and improve my original solution, but he also spent the time explaining some of the fundamentals of YTT as they pertained to the problem I was trying to solve. For those who are simply interested in the final solution, you can head over to this GitHub repo https://github.com/lamw/tkg-multi-vcenter-ytt which contains both the sed and YTT solutions.

In the section below, I will step through each YTT file and explain what the syntax is doing for those interested in learning more.

Let's start with the values.yaml file. As the name suggests, it contains all the values that you wish to replace within the supplied base YAML called tkg-cluster-01-BASE.yaml. This is the only file that needs to be edited by a user; the rest is taken care of by YTT, which handles the replacement and constructs a new YAML file that is ready to be used by TKG.

To reduce the amount of typing that a user needs to do for the explicit vSphere inventory path values, we are taking advantage of both YTT data values, the plain YAML keys such as vcenter (L4-6), and YTT variables, the #@ name = "..." assignments (L8-12). Currently, a data value cannot reference another data value defined in the same file, which is why YTT variables are used for the path components. However, YTT variables are scoped to the file in which they are defined, so the final paths still have to be exposed as YTT data values in order for us to reference them in our overlay.yaml file, which is responsible for processing our input YAML file.

A nice property of the values.yaml is that each deployment can be source controlled and easily managed by modifying only a handful of values, rather than searching through a gigantic YAML file. Once the user values are provided, we simply transform the data (L15-19) into the expected vSphere inventory paths, which you have probably seen before if you have ever worked with TKG or the vSphere API.

#@data/values
---
#! ytt data values
vcenter: "vcsa.vmware.corp"
tkgClusterName: "tkg-cluster-01"
networkName: "VM Network"
#! ytt variables
#@ datacenterName = "Palo-Alto"
#@ datastoreName = "vsanDatastore"
#@ resourcePoolName = "Cluster-01"
#@ folderName = "TKG"
#@ templateName = "photon-3-kube-v1.20.5_vmware.2"

#! #### Do Not Edit Beyond Here ####
datacenter: #@ "/" + datacenterName
datastore: #@ "/" + datacenterName + "/datastore/" + datastoreName
folder: #@ "/" + datacenterName + "/vm/" + folderName
resourcePool: #@ "/" + datacenterName + "/host/" + resourcePoolName
template: #@ "/" + datacenterName + "/vm/" + folderName + "/" + templateName
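
To make the path construction concrete, here is what the data values should be equivalent to once YTT has evaluated the expressions, using the sample inputs from the file above. This is only an illustration of the string concatenation; it is not a file that needs to be created.

```yaml
#! Resolved data values for the sample inputs in values.yaml (illustration only)
vcenter: "vcsa.vmware.corp"
tkgClusterName: "tkg-cluster-01"
networkName: "VM Network"
datacenter: "/Palo-Alto"
datastore: "/Palo-Alto/datastore/vsanDatastore"
folder: "/Palo-Alto/vm/TKG"
resourcePool: "/Palo-Alto/host/Cluster-01"
template: "/Palo-Alto/vm/TKG/photon-3-kube-v1.20.5_vmware.2"
```

Note that the datacenterName, datastoreName and other variables do not appear here: they only exist while values.yaml is being evaluated, which is why the overlay can only use the keys listed above.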

The overlay.yaml contains the list of YTT operations to perform, given the values.yaml and the input YAML file that we want to transform. In our example, we are simply replacing the values of specific keys so that they map to the destination vCenter Server where the TKG Workload Cluster will be deployed.

Here is a screenshot of a section of the original YAML on the left; on the right, the highlighted areas are what we want to modify with the respective values:

The following YTT statement matches on the VSphereMachineTemplate kind, which has two entries in our base YAML, one representing the control plane and the other the worker nodes. We use the expects=2 parameter to tell YTT that it should find exactly two matches and error out otherwise. Next, we replace a number of values, as annotated below, using the YTT data values that we defined in the values.yaml file. The substitution for this section is pretty straightforward and the only problem I ran into was handling the network.devices section, which is defined as an array. On L14, we tell YTT to match on all entries, in case there are multiple network adapters, and that it can expect one or more entries. After that, we update the networkName property just like we did for the other keys.

#@overlay/match by=overlay.subset({"kind":"VSphereMachineTemplate"}),expects=2
---
spec:
  template:
    spec:
      datacenter: #@ data.values.datacenter
      datastore: #@ data.values.datastore
      folder: #@ data.values.folder
      resourcePool: #@ data.values.resourcePool
      server: #@ data.values.vcenter
      template: #@ data.values.template
      network:
        devices:
        #@overlay/match by=overlay.all, expects="1+"
        -
          networkName: #@ data.values.networkName
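
The snippets in this post leave out the load statements that a standalone overlay file needs before it can reference overlay and data. A minimal sketch of a complete file, assuming the standard @ytt:overlay and @ytt:data modules and a hypothetical file name of overlay-demo.yaml, might look like this (only two of the keys are replaced to keep it short):

```yaml
#! overlay-demo.yaml (hypothetical file name, trimmed version of the overlay above)
#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")

#! Match every document of kind VSphereMachineTemplate and fail unless exactly two are found
#@overlay/match by=overlay.subset({"kind": "VSphereMachineTemplate"}), expects=2
---
spec:
  template:
    spec:
      server: #@ data.values.vcenter
      network:
        devices:
        #! Apply the change to every array item and require at least one item
        #@overlay/match by=overlay.all, expects="1+"
        - networkName: #@ data.values.networkName
```

The expects argument is what turns a silent mismatch into an error. Besides an exact count such as 2, it also accepts range strings such as "1+", which is the form used for the network devices.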

Next, here is a screenshot of another section of the original YAML on the left; on the right, the highlighted areas are what we want to modify with the respective values:

In this example, although we are simply modifying two values, you will see that the contents of stringData are a bit more complex because they actually contain embedded YAML content. I also knew this was going to be a bit tricky when Dmitriy asked if I had seen the movie Inception before 🙂

The following YTT statements use a custom function called update_vsphere_cpi_conf (L7-10). It decodes the embedded YAML, overlays the replacement values returned by a second function called vsphere_cpi_conf_values, and then encodes the result back into a string. Because only the part after the --- separator is decoded, the function also prepends the two YTT annotation lines that were part of the original embedded YAML. This is certainly not a common pattern you would normally find, but it is good to see that YTT is powerful enough to tackle this issue as well.

#@ def vsphere_cpi_conf_values():
vsphereCPI:
  server: #@ data.values.vcenter
  datacenter: #@ data.values.datacenter
#@ end

#@ def update_vsphere_cpi_conf(old, _):
#@   header = "#@data/values\n#@overlay/match-child-defaults missing_ok=True\n---\n"
#@   return header+yaml.encode(overlay.apply(yaml.decode(old.split("---")[1]), vsphere_cpi_conf_values()))
#@ end

#@overlay/match by=overlay.subset({"kind":"Secret", "metadata": {"name": data.values.tkgClusterName+"-vsphere-cpi-addon"}})
---
stringData:
  #@overlay/replace via=update_vsphere_cpi_conf
  values.yaml:
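
Since the one-line return statement packs several steps together, here is a sketch of the same logic unrolled into intermediate variables. It is intended to behave the same as the function above; the variable names body, decoded and merged are my own, and the load statements assume the standard @ytt:overlay, @ytt:data and @ytt:yaml modules.

```yaml
#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")
#@ load("@ytt:yaml", "yaml")

#! Replacement values that get overlaid onto the embedded document
#@ def vsphere_cpi_conf_values():
vsphereCPI:
  server: #@ data.values.vcenter
  datacenter: #@ data.values.datacenter
#@ end

#! old is the current string stored under values.yaml; the second argument is unused
#@ def update_vsphere_cpi_conf(old, _):
#!   The two annotation lines and the separator are re-added at the end
#@   header = "#@data/values\n#@overlay/match-child-defaults missing_ok=True\n---\n"
#!   1. Keep only the text after the first "---" separator
#@   body = old.split("---")[1]
#!   2. Parse that text into a YAML structure
#@   decoded = yaml.decode(body)
#!   3. Overlay the replacement values onto the parsed structure
#@   merged = overlay.apply(decoded, vsphere_cpi_conf_values())
#!   4. Serialize back to a string and put the header in front
#@   return header + yaml.encode(merged)
#@ end

#@overlay/match by=overlay.subset({"kind": "Secret", "metadata": {"name": data.values.tkgClusterName + "-vsphere-cpi-addon"}})
---
stringData:
  #@overlay/replace via=update_vsphere_cpi_conf
  values.yaml:
```

One thing to keep in mind with this approach is that the split assumes the embedded string contains exactly one --- separator, with the annotations before it and the configuration after it.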

Here is a screenshot of another section of the original YAML on the left; on the right, the highlighted areas are what we want to modify with the respective values:

Similar to the previous example, the following YTT statements look for the specific Secret, perform the same transformation on the embedded YAML and replace the three desired keys.

#@ def vsphere_csi_conf_values():
vsphereCSI:
  server: #@ data.values.vcenter
  datacenter: #@ data.values.datacenter
  publicNetwork: #@ data.values.networkName
#@ end

#@ def update_vsphere_csi_conf(old, _):
#@   header = "#@data/values\n#@overlay/match-child-defaults missing_ok=True\n---\n"
#@   return header+yaml.encode(overlay.apply(yaml.decode(old.split("---")[1]), vsphere_csi_conf_values()))
#@ end

#@overlay/match by=overlay.subset({"kind":"Secret", "metadata": {"name": data.values.tkgClusterName+"-vsphere-csi-addon"}})
---
stringData:
  #@overlay/replace via=update_vsphere_csi_conf
  values.yaml:
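
To illustrate the effect on the embedded string, here is a trimmed before and after view. The "before" values (old-vcenter.example.com and so on) are hypothetical placeholders, and the real Secret carries more keys than the three shown, which the overlay leaves in place. The "after" half uses the sample inputs from values.yaml; the exact quoting and indentation that YTT emits may differ slightly.

```yaml
#! Before: embedded string in the base YAML (trimmed, placeholder values are hypothetical)
stringData:
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    vsphereCSI:
      server: old-vcenter.example.com
      datacenter: /Old-Datacenter
      publicNetwork: Old-Network
---
#! After: roughly what the overlay should produce for the sample values.yaml
stringData:
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    vsphereCSI:
      server: vcsa.vmware.corp
      datacenter: /Palo-Alto
      publicNetwork: VM Network
```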

Lastly, here is a screenshot of the final section of YAML that needs to be transformed. The original YAML is on the left; on the right, the highlighted section is what we want to modify with the respective values:

The following YTT statement looks for the VSphereCluster kind and filters based on the expected metadata.name value, which in our case, is the name of the TKG Workload Cluster and then performs a simple replacement.

#@overlay/match by=overlay.subset({"kind":"VSphereCluster", "metadata": {"name": data.values.tkgClusterName}})
---
spec:
  server: #@ data.values.vcenter

Putting all of this together, we now have our final overlay.yaml. Combining that with our custom values.yaml, we can easily transform the base YAML and get the desired output by running the following command:

ytt -f values.yaml -f overlay.yaml -f tkg-cluster-01-BASE.yaml

Although I have a working YTT solution, I definitely would not have figured this out by myself and, for this particular problem, I think a simple sed one-liner would still be my preferred solution.

With that said, I definitely have a better appreciation for what YTT can do and I also want to thank Dmitriy for his time in educating and helping me with the final solution. I also really appreciated that the YTT maintainers were genuinely interested in learning from my experience and in finding things that they could improve. In fact, while diving through each section, there were several takeaways on areas that could be improved, such as error handling, which can certainly make or break an experience when a user runs into a problem.

YTT may not be my go-to solution for this specific issue, but I plan on evaluating it for our VEBA project!
