Since the release of vSphere 7.0 Update 1, the demand and interest from the community in getting hands-on with vSphere with Tanzu and the new simplified networking solution has been non-stop. Most folks are either upgrading their existing homelab or looking to purchase new hardware that can better support the new features of the vSphere 7.0 release.
Although vSphere with Tanzu now has a flavor that does not require NSX-T, which helps reduce the barrier to getting started, it still has some networking requirements that may not be easily met in all lab environments. In fact, this was the primary reason I started looking into this, since my personal homelab network is very basic and I do not have, nor want, a switch that can support multiple VLANs, which is one of the requirements for vSphere with Tanzu.
While investigating a potential solution, which included way too MANY hours of debugging and troubleshooting, I also thought about the absolute minimum amount of resources I could get away with after putting everything together. To be clear, my homelab is comprised of a single Supermicro E200-8D with 128GB of memory, which has served me well over the years, and I highly recommend it for anyone who can fit it into their budget. With that said, I did set out with a pretty aggressive goal of using something that is pretty common in VMware homelabs: an Intel NUC with just 32GB of memory.
UPDATE (07/02/24) - As of vSphere 8.0 Update 3, you no longer have the ability to configure a single Supervisor Control Plane VM using the minmasters and maxmasters parameters, which have also been removed from /etc/vmware/wcp/wcpsvc.yaml in favor of allowing users to control this configuration programmatically as part of enabling vSphere IaaS (formerly known as vSphere with Tanzu). The updated vSphere IaaS API that allows users to specify the number of Supervisor Control Plane VMs will not be available until the next major vSphere release. While this regressed capability is unfortunate, it was also not an officially supported configuration; users who wish to specify the number of Supervisor Control Plane VMs using the YAML method will need to use an earlier version of vSphere.
UPDATE (09/17/22) - The steps outlined in this blog post are also applicable for running vSphere with Tanzu using vSphere 8.
Sweet! Looks like running a complete vSphere w/Tanzu lab with just 32GB of memory (https://t.co/jisSFCTYKM) still works with #vSphere8
I'm using my latest Intel NUC 12 Pro https://t.co/DEgOQnF3zr but any prior generation works too pic.twitter.com/oGPtzHzJhC
— William Lam (@lamw) September 17, 2022
Here is the hardware BOM (similar hardware should also work):
- Intel NUC 10i7FNH
- 32GB memory
- Single 250GB M.2 NVMe SSD
- NUC can support two SSD (M.2 + SATA), you can always go larger
Here is the software BOM:
- vCenter Server Appliance 7.0 Update 1 Build 16860138
- ESXi 7.0 Update 1 Build 16850804
- HAProxy v0.1.8 OVA
- Photon OS 3.0 OVA
Note: The Intel NUCs (Gen 6 to 10) can all support up to 64GB of memory and this is one of the best upgrades you can give yourself, but if you only have 32GB of memory, this will also work.
The final solution will comprise the following:
- 1 x vCenter Server Appliance (VCSA) running on the Intel NUC self-managing the ESXi host
- VMFS storage will be used instead of vSAN to reduce memory footprint (If you have 64GB of memory, recommend using vSAN)
- Onboard NIC will be used for all traffic and will be attached to a Distributed Virtual Switch (VDS)
- 3 x Distributed Portgroups will be configured on top of your existing LAN network, the latter two will be routed through our Photon OS Router VM
- Management - Existing LAN network
- Frontend - 10.10.0.0/24
- Workload - 10.20.0.0/24
- 1 x vSphere with Tanzu Cluster enabled with Workload Management
- 1 x HAProxy VM deployed using 3-NIC configuration
- 1 x Photon OS Linux VM used as a Router for IP forwarding and optionally, a DNS server if you do not already have one
- 9 x IP Addresses in total will be required from your local LAN network
- 4 x IP Addresses which should map to the following hostnames or similar
- esxi-01.tanzu.local
- vcsa.tanzu.local
- router.tanzu.local
- haproxy.tanzu.local
- 5 x IP Addresses in a consecutive block (e.g. 192.168.30.20-192.168.30.24) will be needed for the Supervisor Control Plane VMs
As part of this solution, I have automated as many of the tasks as possible, and all scripts used for this solution can be found at https://github.com/lamw/vsphere-with-tanzu-homelab-scripts, which I will be referencing throughout the instructions. There are also a number of techniques and tricks I am using to reduce the overall memory footprint for setting up vSphere with Tanzu; obviously, these should not be used in a Production-grade environment.
I also want to give a huge thanks to Timo Sugliani for all of his help with the networking questions/challenges and to Mayank B. from the vSphere with Tanzu Engineering team, who helped with the debugging and ultimately made this solution a possibility.
Step 1 - Install ESXi 7.0 Update 1 onto your Intel NUC or any other system that you will be using for this setup.
Step 2 - Next, we will set up a Photon OS VM to serve as our "Router", which we will call the RouterVM from here on out. It will also serve as our lab DNS server in case you do not already have one running. Download the latest Photon OS 3.0 OVA and deploy it to the ESXi host. You can either use OVFTool (which you will need to install on your local machine if you do not have it) along with the deploy_photon_router_ova.sh shell script with modifications, or you can simply use the ESXi Embedded Host Client UI (open a browser to the IP Address of your ESXi host).
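If you go the OVFTool route, the command will look roughly like the following (a sketch only; the datastore and portgroup names are examples for this setup, and the OVA filename is a placeholder for whichever Photon OS 3.0 release you downloaded):
ovftool --acceptAllEulas --name=router.tanzu.local --datastore=datastore1 --network="VM Network" --diskMode=thin --powerOn photon-3.0.ova 'vi://root@esxi-01.tanzu.local/'
You will be prompted for the ESXi root password, and the VM lands on the default "VM Network" portgroup since the VDS does not exist yet at this point.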
Step 3 - Power on the VM and once it has booted, login using the VM Console. The default credentials are root/changeme; change the password as instructed.
Step 4 - Download and upload the setup_photon_router.sh shell script to the RouterVM. Edit the shell script and update the variables to match your environment. If you want the DNS server (which uses unbound) to be set up, you will need to edit the configuration to reflect the desired hostnames and IP Addresses. The script will configure the following (a simplified sketch of these settings follows the list):
- Enables IP forwarding by setting net.ipv4.ip_forward = 1
- Configures a static IP Address for eth0 and creates entries for both eth1 (10.10.0.1) and eth2 (10.20.0.1), which will be added later; you should not touch these addresses within the script unless they conflict with your existing network
- Configures iptables (please do not disable it) for connectivity and enables IP Masquerade, which allows our private networks (10.10.0.0/24 & 10.20.0.0/24) to connect outbound to the internet (if needed)
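For reference, the core of what the script applies boils down to something like this (a simplified sketch, assuming eth0 is your LAN-facing interface; the actual script also persists these settings and writes the Photon OS network configuration files):
# enable IP forwarding so the RouterVM can route between networks
sysctl -w net.ipv4.ip_forward=1
# masquerade traffic from the private Frontend/Workload networks out of eth0
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
iptables -A FORWARD -i eth1 -j ACCEPT
iptables -A FORWARD -i eth2 -j ACCEPT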
It is important that you run the script from inside the VM Console, since it will change the IP Address of eth0 and cause your SSH session to disconnect. After the script has completed, you can verify that everything was configured by ensuring DNS lookups are working for both external sites as well as the chosen hostnames, if you are using the RouterVM as your local lab DNS server. The IP Address should be whatever IP you selected for the RouterVM; in my example this is 192.168.30.2
nslookup vmware.com 192.168.30.2
nslookup vcsa.tanzu.local 192.168.30.2
nslookup 192.168.30.5 192.168.30.2
Step 4 - With our RouterVM configured (since I was using it for local DNS), we can now deploy the VCSA. Download the VCSA 7.0 Update 1 ISO and extract the contents onto your local desktop. You can install the VCSA using either the UI Installer or the CLI Scripted Installer. I generally prefer the CLI, not only for Automation purposes but because the entire configuration is encoded into a single JSON configuration file which can be used to re-deploy or saved for future reference. Download the vcsa.tanzu.local.json JSON template and update it to reflect your environment settings.
To deploy, you will need to change into vcsa-cli-installer directory from your desktop OS (Windows, Mac or Linux) and then run the following command and pass in the path to the JSON configuration file:
./vcsa-deploy install --accept-eula --acknowledge-ceip --no-ssl-certificate-verification ~/vcsa.tanzu.local.json
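Before kicking off the actual install, you can optionally sanity-check your JSON file; recent versions of the CLI installer support a template verification mode along these lines:
./vcsa-deploy install --accept-eula --no-ssl-certificate-verification --verify-template-only ~/vcsa.tanzu.local.json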
This step will take some time, and it is also a good time for a water, coffee, tea or beer break depending on when you are doing this.
Note: SSH should be enabled if you decide to deploy from the VCSA UI Installer
Step 5 - Once the VCSA has successfully completed installation, SSH to the VCSA using root and the password you had configured. You will be dropped into the appliancesh; simply type "shell" and hit enter to exit into a standard bash shell.
Note: If you wish to disable the appliancesh, run the following command: chsh -s /bin/bash
Step 6 - Edit /etc/vmware/wcp/wcpsvc.yaml and replace the values for both minmasters and maxmasters from 3 to 1 and then save and exit. This change reduces the number of Control Plane VMs used for the Supervisor Cluster when enabling Workload Management.
UPDATE (09/28/21) - As of vSphere 7.0 Update 3, you can now have just a single Supervisor Control Plane VM
Note: For more details about this change, please see https://www.williamlam.com/2020/04/deploying-a-minimal-vsphere-with-kubernetes-environment.html
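If you prefer to make the edit non-interactively from the VCSA shell, something along these lines should work (a sketch; verify the exact key/value formatting in your copy of wcpsvc.yaml before running, then restart the wcp service or simply proceed with the VCSA reboot in the next step):
sed -i 's/minmasters: 3/minmasters: 1/; s/maxmasters: 3/maxmasters: 1/' /etc/vmware/wcp/wcpsvc.yaml
service-control --restart wcp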
Step 7 - Login to the ESXi Embedded Host Client, shut down the VCSA VM, change the memory from 12GB to 8GB and then power the VM back on.
Step 8 - Once the VCSA is back online and you can login to the vSphere UI, download the setup_vcsa.ps1 PowerCLI script onto your desktop. This script will help set up your new VCSA with the following:
- Disables the vCenter Network Rollback feature to allow for a single NIC VDS configuration
- Creates vSphere Datacenter & Cluster (enables vSphere DRS/HA which is required for vSphere with Tanzu)
- Adds the physical ESXi host to newly created vSphere Cluster
- Creates Distributed Virtual Switch along with the following portgroups (Management, Frontend & Workload)
You technically only need to update the $VCSAHostname, $VCSAUsername, $VCSAPassword, $ESXiHostname & $ESXiPassword as the rest are simply labels of the vSphere objects that will be created for you as part of the automation. You can certainly change the names, but be aware that any changes will need to be reflected when enabling Workload Management.
To run the script, you will need to ensure you have both the latest PowerShell Core and PowerCLI installed on your system. To verify that PowerShell is working, type pwsh and it should successfully open up a PowerShell prompt. To run the script, enter the following command:
./setup_vcsa.ps1
There should not be any errors after the script completes; if there are, double check your credentials. At this point, you can refresh the vSphere UI or log back in to see the newly created objects.
Step 9 - This step is ONLY required if you are using an Intel NUC or any other platform that only has a single onboard NIC that is connected. If you have more than one NIC and it is connected to the same LAN network, then you can simply attach the secondary NIC to the VDS that was created and move on to the next step.
Since vSphere with Tanzu requires the use of a VDS, we need to migrate both our ESXi management interface as well as the workloads to our newly created VDS. Since we only have a single NIC, after the initial migration of the ESXi host interface, we will lose connectivity to the VCSA because the single NIC is providing both management and VM traffic. This is expected and there will be several steps to remediate this after the fact.
Navigate to the Networking view in the vSphere UI and right click on the VDS and select Add and Manage Hosts option
Select Add Hosts and then add our ESXi host using the "+" icon and click Next
On the physical adapters, select vmnic0 and then click on Assign uplink icon at the top of the menu as shown in the screenshot below.
On the vmkernel adapters, select vmk0 and then click on the Assign port group icon at the top of the menu as shown in the screenshot below.
We will skip Migrate VM networking, as that will be done in the ESXi Embedded Host Client, and then click Finish to perform the migration from the Virtual Standard Switch (VSS) to the VDS. It is expected that the VCSA is no longer accessible after you click Finish; this is because our VCSA networking has not yet been updated to point to the new VDS port group which is connected to our NIC.
Lastly, login to ESXi Embedded Host Client and edit the VCSA VM and change the network adapter to now point to our Management port group.
After making the change, you should now be able to refresh the vSphere UI and VCSA should now respond again.
Note: For more details about this technique using a single NIC for VDS running VCSA on top, please see https://www.williamlam.com/2015/11/migrating-esxi-to-a-distributed-virtual-switch-with-a-single-nic-running-vcenter-server.html
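If something goes sideways during the migration and you also lose access to the ESXi host itself, you can inspect the uplink and VMkernel assignments directly from an SSH session or the ESXi Shell with commands such as the following (shown here only as a troubleshooting aid):
esxcli network vswitch dvs vmware list
esxcli network ip interface list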
Step 9 - Now that our VDS is properly configured, we will need to go back and edit our RouterVM so that eth0 points to the Management network and we need to add two additional network adapters: eth1 for Frontend and eth2 for Workload networks. Your configuration should match what is shown in the screenshot below.
Step 10 - We need to add a couple of static routes on our local desktop machine, or on whichever machine you plan to connect from and deploy your Tanzu Kubernetes Grid (TKG) Guest Clusters. These static routes will allow us to reach both our Frontend (10.10.0.0/24) and Workload (10.20.0.0/24) Networks using our RouterVM, which have already been pre-created as part of the automation. For macOS, the following two commands need to be run to create the required static routes, where 192.168.30.2 is the RouterVM eth0 IP Address.
sudo route -n add -net 10.10.0.0/24 192.168.30.2
sudo route -n add -net 10.20.0.0/24 192.168.30.2
If you are on Windows, you can follow this guide here for creating static routes.
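The equivalent commands on Windows, run from an elevated Command Prompt, look something like this (the -p flag makes the routes persistent across reboots):
route -p ADD 10.10.0.0 MASK 255.255.255.0 192.168.30.2
route -p ADD 10.20.0.0 MASK 255.255.255.0 192.168.30.2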
To verify that everything is working, you should now be able to successfully ping both 10.10.0.1 and 10.20.0.1 which are the respective gateways for each of our private networks.
Step 11 - Download the deploy_3nic_haproxy.ps1 PowerCLI script, which will automate the deployment of the HAProxy OVA using a 3-NIC configuration. You will need to update the variables in the script to match your environment; the networks defined for both Frontend and Workload should be left alone if you have been following the rest of the configuration.
Type pwsh in your terminal and it should successfully open up a PowerShell prompt. To run the script, enter the following command:
./deploy_3nic_haproxy.ps1
There should not be any errors after the script has completed.
Step 12 - Upload the setup_haproxy.sh shell script to the HAProxy VM and then execute the script as shown in the screenshot below. This disables Reverse Path Filtering, which is required for proper network connectivity to our Frontend and Workload networks in this solution.
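For reference, disabling Reverse Path Filtering comes down to sysctl settings along these lines (a sketch of what the script takes care of for you; depending on the appliance, the per-interface rp_filter settings may also need to be set to 0):
sysctl -w net.ipv4.conf.all.rp_filter=0
sysctl -w net.ipv4.conf.default.rp_filter=0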
At this point, we can verify that all networking is configured correctly by performing the following ping tests from these source/destination systems (a quick loop for running them is sketched after the list):
From the RouterVM ping all HAProxy VM interfaces:
- 192.168.30.6
- 10.10.0.2
- 10.20.0.2
From the HAProxy VM (192.168.30.6) ping Router VM Frontend/Workload interfaces:
- 10.10.0.1
- 10.20.0.1
From your Desktop, ping HAProxy VM Frontend/Workload interfaces:
- 10.10.0.2
- 10.20.0.2
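A quick way to run through the desktop checks (and the gateway checks from Step 10) is a small loop like this, adjusting the address list for the RouterVM and HAProxy VM tests as needed:
for ip in 10.10.0.1 10.10.0.2 10.20.0.1 10.20.0.2; do ping -c 2 "$ip"; done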
Step 10 - Lastly, before we can enable Workload Management on our vSphere with Tanzu Cluster, we need to create the required TKG Content Library. Navigate to Menu->Content Libraries and create a new subscribed library with the following URL: https://wp-content.vmware.com/v2/latest/lib.json and ensure content is downloaded immediately which is the default.
Depending on your internet connection, this can take some time to download, and it needs to complete before you can progress to the next step. This is a good time to take another break.
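If the Content Library sync appears stuck, a quick sanity check that the subscription URL is reachable (the VCSA needs to be able to reach it as well) is something like:
curl -sI https://wp-content.vmware.com/v2/latest/lib.json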
Step 12 - We are now finally ready to enable Workload Management on our vSphere with Tanzu Cluster! You now have the option of using the vSphere UI, where you can follow the Workload Management wizard, OR if you prefer the "easy" button, you can use my PowerCLI WorkloadManagement Module, which you can install by simply running Install-Module VMware.WorkloadManagement
Below is the snippet of code which will connect to our VCSA and perform the enablement. You will need to replace the values with your own configuration, especially in $vSphereWithTanzuParams, which should reflect the earlier configuration. There should only be a handful of changes: the endpoints, the credentials (if you did not use the defaults) and the local LAN addresses. The rest of the values should match the script defaults.
Connect-VIServer -Server vcsa.tanzu.local -User *protected email* -Password VMware1!
Connect-CisServer -Server vcsa.tanzu.local -User *protected email* -Password VMware1!
Import-Module VMware.WorkloadManagement
$vSphereWithTanzuParams = @{
    ClusterName = "Tanzu-Cluster";
    TanzuvCenterServer = "vcsa.tanzu.local";
    TanzuvCenterServerUsername = "*protected email*";
    TanzuvCenterServerPassword = "VMware1!";
    TanzuContentLibrary = "TKG-Content-Library";
    ControlPlaneSize = "TINY";
    MgmtNetwork = "Management";
    MgmtNetworkStartIP = "192.168.30.20";
    MgmtNetworkSubnet = "255.255.255.0";
    MgmtNetworkGateway = "192.168.30.1";
    MgmtNetworkDNS = @("192.168.30.2");
    MgmtNetworkDNSDomain = "tanzu.local";
    MgmtNetworkNTP = @("162.159.200.123");
    WorkloadNetwork = "Workload";
    WorkloadNetworkStartIP = "10.20.0.10";
    WorkloadNetworkIPCount = 20;
    WorkloadNetworkSubnet = "255.255.255.0";
    WorkloadNetworkGateway = "10.20.0.1";
    WorkloadNetworkDNS = @("10.20.0.1");
    WorkloadNetworkServiceCIDR = "10.96.0.0/24";
    StoragePolicyName = "Tanzu-Storage-Policy";
    HAProxyVMvCenterServer = "vcsa.tanzu.local";
    HAProxyVMvCenterUsername = "*protected email*";
    HAProxyVMvCenterPassword = "VMware1!";
    HAProxyVMName = "haproxy.tanzu.local";
    HAProxyIPAddress = "192.168.30.6";
    HAProxyRootPassword = "VMware1!";
    HAProxyPassword = "VMware1!";
    LoadBalancerStartIP = "10.10.0.64";
    LoadBalancerIPCount = 64
}
New-WorkloadManagement2 @vSphereWithTanzuParams
Note: For more details about automating the enablement of Workload Management with vSphere with Tanzu, please see https://www.williamlam.com/2020/10/automating-workload-management-on-vsphere-with-tanzu.html
The deployment will take some time; from my testing, it roughly takes ~20 minutes once the Supervisor Cluster VMs initiate their OVF deployments. During this period, you will see a number of warnings and even errors which you can simply ignore, as these will all go away upon a successful deployment. If your deployment takes more than 40 minutes, something has gone wrong, most likely due to network connectivity.
A successful deployment will show Config Status of Running and IP Address under the Control Plane Node column as shown in the example below. If you have not modified the default Frontend networks (which you should not have to), then the expected IP Address should be 10.10.0.64 which is the first starting address from our HAProxy Load Balancer configuration.
Step 13 - To deploy a TKG Guest Cluster, you will need to configure a vSphere Namespace and download the kubectl-vsphere plugin to your local desktop. Under the Namespaces tab within the Workload Management UI, select our vSphere with Tanzu Cluster and provide a name. In my example, I am using primp-industries and the remainder of the examples will assume this name, so make sure to replace it with your own if you decide to use a different value.
Step 14 - After the vSphere Namespace has been created, click on Add Permissions to assign the user *protected email*, or any other valid user within vSphere, permission to deploy workloads.
Step 15 - Click on Edit Storage to assign the VM Storage Policy Tanzu-Storage-Policy or any other valid VM Storage Policy.
Step 16 - With the introduction of VM Classes in vSphere 7.0 Update 2a, there is a behavior change where you need to explicitly associate the set of VM Class(es) before you can use them. Under the VM Service within a vSphere Namespace, click on Manage VM Classes and then select all VM Class(es) you wish to use. If you are not sure, I would recommend simply selecting them all, or you will run into an issue where the TKC will simply not deploy.
Step 17 - Finally, click on Open URL under the Namespace Status tile to download kubectl and the vSphere plugin, and extract it onto your desktop.
Step 18 - Login to the Supervisor Control Plane using the kubectl-vsphere plugin; the IP Address will be the one from Step 12.
./kubectl-vsphere login --server=10.10.0.64 -u *protected email* --insecure-skip-tls-verify
Step 19 - Switch context to our vSphere Namespace
./kubectl config use-context primp-industries
Step 20 - Download the tkc.yaml example, which describes our TKG Guest Cluster deployment consisting of a single control plane node and a single worker node. If you chose a different vSphere Namespace, make sure to edit the tkc.yaml to reflect those changes. To deploy, run the following command:
./kubectl apply -f tkc.yaml
It will take some time for both the Control Plane and Worker Node VMs to deploy, especially since the Worker Node can take a few more minutes after the Control Plane VM has been deployed. Please be patient, but it should not take more than 10 minutes. You can monitor the progress using the vSphere UI or you can use kubectl, specifying the name of your TKG Guest Cluster, and wait until the Phase column shows running.
./kubectl get tanzukubernetescluster william-tkc-01
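If the cluster seems stuck in a creating or pending phase, describing the object will usually surface the reason in its events and conditions, for example:
./kubectl describe tanzukubernetescluster william-tkc-01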
Step 21 - To use our TKG Guest Cluster, we need to login and switch into its context by running the following two commands:
./kubectl-vsphere login --server=10.10.0.64 -u *protected email* --insecure-skip-tls-verify --tanzu-kubernetes-cluster-name william-tkc-01 --tanzu-kubernetes-cluster-namespace primp-industries
./kubectl config use-context william-tkc-01
Step 22 - Let's now deploy our first K8s application, and instead of a boring WordPress or kuard (Kubernetes Up and Running demo), let's try something a bit more interesting, like say Doom. Yup, there's a K8s app for that called KubeDoom, which I have written about here.
UPDATE (09/07/23) - If you deploy Metallb or another type of service load balancer, you can use the following kubedoom.yaml to expose the VNC service
You will need to install the git CLI and a VNC Client on your local desktop, I am using VNC Viewer but any VNC client will work. Run the following commands to clone the kubedoom repo and then deploy the K8s application:
git clone https://github.com/storax/kubedoom.git
cd kubedoom
kubectl apply -f manifest/
You can verify that the deployment is successful when the pods go into a running state by using the following command:
kubectl -n kubedoom get pods
Note: For other fun and interesting K8s applications and demos, please see https://www.williamlam.com/2020/06/interesting-kubernetes-application-demos.html
Step 23 - To connect to our kubedoom deployment from our local desktop, we need to port forward the pod to local port 5900 by running the following command:
kubectl -n kubedoom port-forward --address 0.0.0.0 deployment/kubedoom 5900:5900
Finally, open up your VNC Client and connect to localhost:5900 and the default password is idbehold
BT says
As usual, thank you for all the work you have done. Does it look like on step 4 the correct script to run should be setup_photon_router.sh?
William Lam says
Yes, that's correct. Just fixed the link/name to the correct script
Ivan Garcia says
Hi William,
What about extensions? I'm trying to deploy harbor and contour (as a prerequisite) and I can't deploy kapp-controller properly ... have you had success doing that?
Regards!
Tomas says
Hi William,
With the latest NSX-T 3.1, two VLANs are no longer required for the Edge Node and Transport Node VTEPs. So basically ESX+NSX-T can be deployed in a single Management VLAN, where for the VTEPs you may use an isolated/non-routed subnet just between the ESXi and Edge.
With that, all the HAProxy and Photon OS routing might be avoided. Obviously, the home lab would need 80GB+ more memory for NSX Manager and 2 Edges.
I'm running mine on 256/2CPU ASUS Z10PA-D8.
Support TEPs in different subnets to fully leverage different physical uplinks
https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/rn/VMware-NSX-T-Data-Center-31-Release-Notes.html
jpowell5050 says
Great article! Quick Question: On step 12, I am going the GUI route vs automation. Under the load balancer section, I cannot determine what I should put in the following two fields: Data Plane API Addresses & Server Certificate Authority. Any ideas?
jpowell5050 says
OK....I think I have the data plane figured out: 192.168.30.6:5556
However, I can't figure out where to get the Server CA information so I can continue to the next step. Can you provide direction?
William Lam says
You'll need to SSH to HAProxy VM and look in /etc/haproxy/ca.crt
If you're not going to use the Automation mentioned in the article, please refer to the official documentation for further instructions
jpowell5050 says
That got me over the hump, and I am following the automation except in Step 12 where you state:
You now have the option of using the vSphere UI, which you can follow the Workload Management wizard OR if you prefer the "easy" button
I am using the UI in vSphere for the experience π
When I click "Add" under Workload Network, the portgroup section is blank. The vDS and vPG are present in the vCenter networking. I have been searching all over and can't find this listed anywhere. Any guidance?
Kisung says
You can find the CA info on the HAProxy VM at /etc/haproxy/ca.crt. It should work.
jpowell5050 says
Thanks!
Kisung says
Hi Will, thanks for sharing such valuable assets.
However, I keep facing an issue with Get-ContentLibrary in VMware.WorkloadManagement.
It seems like it doesn't have the Get-ContentLibrary module, and even when I import the module from https://github.com/vmware/PowerCLI-Example-Scripts/blob/master/Modules/ContentLibrary/ContentLibrary.psm1, the syntax is subtly different (e.g., Get-ContentLibrary A / Get-ContentLibrary -Name A). Do you have any idea on this?
William Lam says
This is NOT using my CL Module, it is the official VMware Content Library cmdlet. Make sure you've got the latest PowerCLI version installed (you may also need to unload my module, IIRC, there was some conflict in naming when VMware released their updated version)
Steve Ballmer says
Great work. Thanks.
Tanzup says
Just to confirm you have it all running on a single Intel NUC right?
vdrone says
Hi, I used the disable Reverse Path Filtering script, but outside of the HAProxy I'm unable to ping the load balancers. My UniFi has inter-VLAN routing and all other VLANs work fine. Putting a VM in the same VLAN also does not ping the load balancer IPs. Ping only works when I SSH into the HAProxy; from its console I can ping the load balancers. Somehow the load balancers are unable to be reached outside of the HAProxy. Any ideas?
vdrone says
Fixed it. The USG had a profile issue where the uplink to the switches was missing both VLANs...
Jasmin Manov says
Hi, is there a price to think about here? We have a vSphere Essentials Plus bundle; is Tanzu licensed separately, or is it included in the vSphere license?
Nick says
Is it possible to explain how to deploy Harbor in this type of environment? I've tried but am having a heck of a time. I think there are a lot of network requirements, and I am not familiar with what would need to change to support it.
I was following some steps as outlined here, but the deployment pods never fully start up: https://rguske.github.io/post/vsphere-7-with-kubernetes-supercharged-helm-harbor-tkg/
William Lam says
Yes, this is definitely possible. Harbor installation is pretty straight forward BUT when using it with vSphere w/Tanzu via TKG Service or TKG, you need to do something extra to trust the insecure registry UNLESS you're using a proper CA signed certificate
In fact, you can install Harbor onto the Router VM or create another VM, since you may want more storage space. Here's tl;dr commands to install Harbor https://gist.github.com/lamw/7069bc74e020485d0de8c43a0ff8f67f
After that, you can then pull from your registry, but again, depending if you're doing self-sign or proper signed, there will be additional steps which IIRC, are documented in the official vSphere w/Tanzu guides
Leo says
If not using the RouterVM but an actual DNS server, how would the routing be set up using the same IPs in the examples and scripts?
Preston says
Finally got this working; the delays were mostly because of user error in mapping the IP addresses above to ones available on my home network. One other issue I had was that my NUC has 64GB RAM, so I skipped step 7 and left the VCSA memory alone. Because of this, I didn't shutdown/restart the VCSA VM and the changes from Step 6 weren't applied. Thank you so much!
hadvik says
Great guide and scripts!
You mention that you recommend vSAN if you have 64 GB. But how do you enable vSAN in this setup?
William Lam says
Please see the official VMware documentation for detailed instructions https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vsan-planning.doc/GUID-177C1CF9-EB3F-46C2-BE53-670BF864DC25.html
Bart Agreda says
Hi, I followed the guide and was able to get my Supervisor Cluster up. However, even though the IP is pingable, the Tanzu CLI page won't load. Any ideas where I can look?
Bart says
Got it working. Re-did the WCP initialization. I may have entered an incorrect network setting.
kastro says
Got it working in one afternoon (with few gotchas because I did some steps manually), THANK YOU.
tonydimaggio says
Maybe this is a dumb question, but how are you installing the VCSA on a 250GB M.2? When I do the install it wants 460GB of free disk space in the datastore. Is there a workaround for this?
William Lam says
Enable Thin Prov, it doesn't actually need or use that amount
Nicolas Frey says
Awesome guide and further information. Was already planning for a new one host lab. Now my 8C 64GB 1TB NVME box can still be used. Anyway I will need a refresh for NSX-T. But this is the start. Thank you so much. BR Nicolas
yik says
Hi William, have you ever tested out vSphere 7 U2 on a NUC? Mine is a NUC10i7FNH and it worked fine on vSphere 7 U1, but after upgrading to U2, it has CPU usage issues and stays in an error status.
Sebastian says
Hi William, great post! It helped a lot, but I've been stuck in the deployment part as my Supervisor Cluster VMs never get set up; the VCSA just deploys them and never gives them an IP. I've got a DHCP service in my lab on the same network but no clue what happened. Any advice?
jACHIN says
Hi William, thanks for sharing.
There is a question that bothers me a lot: what if my desktop is in another subnet? It can ping the RouterVM, but can't ping the Frontend and Workload IPs (10.10.0.1 and 10.20.0.1).
Preston Sheldon says
Any thoughts on modifying this post slightly to install the new Tanzu Community Edition ? Would be a great homelab setup for learning to work with management clusters.
William Lam says
You're mixing up vSphere w/Tanzu and TKG
TCE is already super optimized for Homelab, simply follow https://tanzucommunityedition.io/docs/latest/getting-started/
Don't have a homelab and only 8GB of memory on your desktop/laptop? Use the Docker workflow. Want to deploy to vSphere? Then you have two choices: Standalone Cluster or Management Cluster. You decide on the cluster size/nodes; I recommend you read through the TCE docs.
Preston says
My main issue is I don't really understand vSphere installs well enough. TCE Docker installs don't survive restarts so I was hoping to configure vSphere on my NUC (64GB) and then install TCE Management cluster on that.
William Lam says
If you're not comfortable with vSphere, it is not a good idea to add another technology, which also has a pretty significant learning curve as that'll only make it more difficult and probably frustrating when something doesn't go well. Start with the basics and ensure you can install vSphere (ESXi and vCenter) and you're familiar with the basic foundational concepts before looking at TCE or TKG
Jeff Lok says
I can create a workload cluster in the GUI, and I tried the automation PS script. However, I got this error. Can you give me some pointers on what is wrong:
...
Get-CisService: 12/3/2021 11:10:26 AM Get-CisService One or more errors occurred. (One or more errors occurred. (The reader's MaxDepth of 64 has been exceeded. Path 'result.output.STRUCTURE['com.vmware.vapi.metadata.metamodel.component_data'].info.ST...
Rakesh Dodeja says
Use PowerShell version 7.1.5
andcon says
Issue:
Unable to resolve the host on control plane VM. The hostname ends with the '.local' top level domain, which requires 'local' to be included in the management DNS search domains.
Solution:
Add local to the management DNS search domains.
However I am unable to add .local or local to DNS Search Domain(s) as it requires a FQDN.
Have you encountered this issue when using a domain with .local and how did you fix it?
Chris says
Did you ever get this to work? I found a KB - https://kb.vmware.com/s/article/83387 - but still no luck.
Joshan says
Hi,
I used the YAML file to create a Tanzu Kubernetes cluster, but the API returns 'The request is invalid'.
VCSA : 7.0.3.00200
ESXI : 7.0.3, 18644231
YAML
-------------------------------------------------
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkgs-cluster-5
  namespace: tkc-lab
spec:
  distribution:
    version: v1.21
  topology:
    controlPlane:
      count: 1
      class: best-effort-small
      storageClass: tanzu-vsan-storage-policy
    workers:
      count: 2
      class: best-effort-small
      storageClass: tanzu-vsan-storage-policy
-------------------------------------------------
Thank you !
Anteneh Asnake says
Hello Will,
I would like to thank you for such an outstanding lab; it was very educational and also helped me do the lab very effectively. One question I have is on Step 17: I found it hard launching from the provided link with the Frontend IP (10.10.0.64). I found it has the kubectl plugin download; is there a way I can get access to the Supervisor Cluster VM and add routing if needed? I really need help on that, thanks.