A very useful property of automation is the ability to experiment. After creating my vSphere 7 with Kubernetes Automation Lab Deployment Script, I wanted to see what the minimal footprint would be, both in terms of physical resources and the underlying components, that would still allow me to run a fully functional vSphere with Kubernetes environment.
Before diving in, let me give you the usual disclaimer 😉
Disclaimer: This is not officially supported by VMware and you can potentially run into issues if you deviate from the official requirements which the default deployment script adheres to out of the box.
In terms of physical resources, you will need a system that can provision a VM with up to 8 vCPU (this can be further reduced, see the Additional Resource Reductions section below), 92GB of memory, and 1TB of storage (thin provisioned), which translates to the following configuration within the script:
- 1 x Nested ESXi VM with 4 vCPU and 36GB memory
- 1 x VCSA with 2 vCPU and 12GB memory
- 1 x NSX-T Unified Appliance with 4 vCPU and 12GB memory
- 1 x NSX-T Edge with 8 vCPU and 32GB memory
Note: You can probably reduce the memory footprint of the ESXi VM further depending on your usage, and the VCSA is using the default values for "Tiny", so you can probably trim its memory down a bit more.
Another benefit of this solution is that reducing the number of ESXi VMs also speeds up the deployment: in just 35 minutes, you can have the complete infrastructure fully stood up and configured to try out vSphere with Kubernetes!
The other trick I leveraged to reduce the amount of resources is changing the default number of Supervisor Control Plane VMs required for enabling vSphere with Kubernetes. By default, three of these VMs are deployed as part of setting up the Supervisor Cluster; however, I found a way to tell the Workload Control Plane (WCP) to only deploy two 🙂
This minimal deployment of vSphere with Kubernetes has already been incorporated into my vSphere with Kubernetes deployment script, but it does require altering several specific settings. You can find the instructions below.
Step 1 - Update the $NestedESXiHostnameToIPs variable in the script so that it only contains a single entry, which tells the script to deploy a single ESXi VM using a local VMFS volume.
$NestedESXiHostnameToIPs = @{
"pacific-esxi-4" = "172.17.36.11"
}
Update the vCPU and Memory variables to match the following configuration:
| Component | vCPU | vMEM (GB) |
|---|---|---|
| ESXi | 4 | 36 |
| VCSA | 2 | 12 |
| NSX Manager | 4 | 12 |
| NSX Edge | 8 | 32 |
Finally, update the following two variables to a value of 0; a consolidated sketch of all of these Step 1 settings follows the code below.
$configureVSANDiskGroup = 0
$clearVSANHealthCheckAlarm = 0
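Here is that sketch. Note that the vCPU/memory variable names shown are my assumptions based on the script's naming conventions (only the vSAN variables are taken verbatim from above), so verify them against your copy of the script before editing:

# Hypothetical vCPU/vMEM variable names -- verify against your copy of the script
$NestedESXivCPU = "4"
$NestedESXivMEM = "36" # GB

# VCSA sizing (2 vCPU / 12GB) comes from the default "Tiny" deployment size noted earlier

$NSXTMgrvCPU = "4"
$NSXTMgrvMEM = "12" # GB

$NSXTEdgevCPU = "8"  # can be reduced to 4 for a Medium Edge, see Additional Resource Reductions
$NSXTEdgevMEM = "32" # GB; 8 for a Medium Edge

# Disable vSAN configuration, since a single ESXi host cannot form a vSAN cluster
$configureVSANDiskGroup = 0
$clearVSANHealthCheckAlarm = 0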
Step 2 - Run the script like you normally would to deploy the infrastructure, but do not proceed to enable vSphere with Kubernetes yet. We need to make a modification to the VCSA before doing so.
Step 3 - SSH to the deployed VCSA, edit /etc/vmware/wcp/wcpsvc.yaml, and update the following variables to a value of 2, then save and exit the file.
minmasters 2
maxmasters 2
UPDATE (09/28/21) - As of vSphere 7.0 Update 3, you can now have just a single Supervisor Control Plane VM by setting both of these values to 1.
Next, we need to restart the WCP service for the change to go into effect:
service-control --restart wcp
Step 4 - You can now enable vSphere with Kubernetes using the vSphere UI like you normally would.
Note: By default, there does not appear to be a check for a minimum of three ESXi hosts; as you can see from the screenshot above, it allows me to proceed. The WCP change above only applies to the number of Supervisor Control Plane VMs deployed and does not affect the number of ESXi hosts required.
Once the deployment has completed, you now have vSphere with Kubernetes running on a single ESXi host with just two Supervisor Control Plane VMs. I have done limited testing, but with this reduced configuration I am able to successfully deploy vSphere PodVMs backed by a LoadBalancer Service as well as a Tanzu Kubernetes Grid (TKG) Cluster without any issues. Another variation would be to leave the number of Supervisor Control Plane VMs alone; you can actually run all three on a single ESXi host, as there are no pre-checks here either.
Additional Resource Reductions:
If you look at the very first screenshot above, which shows the amount of resources required for the script, you will see that the bulk is allocated to the NSX-T Edge. A Large NSX-T Edge is recommended, and the Edge size ultimately determines the number of Load Balancers (LBs) and the maximum configurations it can support. This is important because when you deploy a K8s application to the Supervisor Cluster, vSphere with Kubernetes will automatically deploy a Medium-size LB for your application. If you are not using a Large NSX-T Edge, you may not be able to deploy additional applications and/or deploy a TKG Cluster.
In my opinion, being able to use a smaller configuration for demo/POC purposes makes sense, and today there is a significant jump in resources between a Medium and a Large NSX-T Edge. The instructions below show how you can re-size the LB that is provisioned by vSphere with Kubernetes. I have only done limited testing, including deploying a vSphere PodVM application as well as a 3-node TKG Cluster, so your mileage and experience may vary. It is NOT recommended that you make NSX-T configuration changes behind the back of vSphere with Kubernetes, whose objects are protected by default, but if you need to deploy a small setup or are unable to provision a VM with 8 vCPU (which I know several customers have mentioned), then this is a hack that could be considered.
The instructions above are still required, but in Step 1, instead of configuring the NSX-T Edge with 8 vCPU and 32GB memory (Large), we will change that to 4 vCPU and 8GB memory (Medium). Without changing the Nested ESXi VM and VCSA, the overall amount of required memory is now 68GB (36 + 12 + 12 + 8)! As I said, you can probably tune it down further if required.
After you have deployed a vSphere K8s application, a Medium LB will be provisioned in NSX-T. You can see this by logging into NSX-T Manager: under Load Balancing->Load Balancers, you should see both a Distributed Load Balancer (DLB), which is used for Supervisor Cluster namespaces, and a regular LB. We will re-size this LB from Medium to Small using the instructions below.
UPDATE (01/08/21) - As of NSX-T 3.1, there are some additional changes required to reduce the size of the LBs. Please see https://www.vrealize.it/2021/01/08/vsphere-with-tanzu-with-nsx-t-medium-sized-edge/ for the additional instructions.
Step 1 - We will use cURL to perform the necessary API requests, since the LB is a protected object created by vSphere with Kubernetes. You can use any REST client, including Postman and/or PowerShell. For this example, I am just running the cURL commands from within the VCSA. The first step is to list the LBs, which you can do with the following command, replacing the credentials and FQDN with those of your NSX-T Manager.
curl -k -u 'admin:VMware1!VMware1!' -X GET 'https://pacific-nsx-2.cpbu.corp/policy/api/v1/infra/lb-services'
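If you prefer PowerShell (one of the REST client options mentioned above), the following is a minimal sketch of the same call. It assumes PowerShell 7+ for the -Authentication and -SkipCertificateCheck parameters and reuses the example FQDN from this post:

# PowerShell 7+ sketch of the same LB listing request (example FQDN from this post)
$cred = Get-Credential -UserName "admin"   # prompts for your NSX-T admin password

$lbServices = Invoke-RestMethod -Method GET `
    -Uri "https://pacific-nsx-2.cpbu.corp/policy/api/v1/infra/lb-services" `
    -Authentication Basic -Credential $cred -SkipCertificateCheck

# Display each LB's ID and size so the Medium one is easy to spot
$lbServices.results | Select-Object id, size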
Step 2 - Look for the ID of the Medium LB, which you can identify from the size property. Once you have that identifier (e.g. domain-c....), go ahead and perform an additional GET so we can retrieve its current configuration.
curl -k -u 'admin:VMware1!VMware1!' -X GET 'https://pacific-nsx-2.cpbu.corp/policy/api/v1/infra/lb-services/domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff_0_ennif'
Take that output and save it into a file called resize-edge (or any other name of your choosing), then change the value of size from MEDIUM to SMALL as shown in the output below.
{ "connectivity_path" : "/infra/tier-1s/domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff", "enabled" : true, "relax_scale_validation" : true, "size" : "SMALL", "error_log_level" : "INFO", "resource_type" : "LBService", "id" : "domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff_0_ennif", "display_name" : "domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff-0", "tags" : [ { "scope" : "ncp/version", "tag" : "1.2.0" }, { "scope" : "ncp/cluster", "tag" : "domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff" }, { "scope" : "external_id", "tag" : "9ac2ac47-d7af-54da-bf6c-85d6863b1ca4" }, { "scope" : "ncp/created_for", "tag" : "SLB" }, { "scope" : "ncp/lb_t1_link_ip", "tag" : "100.64.224.1" } ], "path" : "/infra/lb-services/domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff_0_ennif", "relative_path" : "domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff_0_ennif", "parent_path" : "/infra", "unique_id" : "60b242eb-8c06-4fb6-9f52-c54fa09581a9", "marked_for_delete" : false, "overridden" : false, "_create_user" : "wcp-cluster-user-domain-c8-94e052d3-1d77-4248-a889-22c6f27f27ab", "_create_time" : 1588030492277, "_last_modified_user" : "wcp-cluster-user-domain-c8-94e052d3-1d77-4248-a889-22c6f27f27ab", "_last_modified_time" : 1588030492296, "_system_owned" : false, "_protection" : "REQUIRE_OVERRIDE", "_revision" : 0 }
Step 3 - Now we need to reconfigure the LB by performing a PATCH operation and specifying our LB ID along with the payload of the resize-edge file as shown in the command below.
curl -k -u 'admin:VMware1!VMware1!' -H "Content-Type: application/json" --data @resize-edge -X PATCH 'https://pacific-nsx-2.cpbu.corp/policy/api/v1/infra/lb-services/domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff_0_ennif' -H "X-Allow-Overwrite: true"
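If you scripted the listing step in PowerShell, the same GET, modify, and PATCH flow can also be done end to end. Below is a sketch for PowerShell 7+, reusing $cred from the earlier sketch along with the example FQDN and LB ID from this post:

# Sketch: fetch the protected LB, change its size, and PATCH it back (PowerShell 7+)
$nsxFqdn = "pacific-nsx-2.cpbu.corp"   # example FQDN from this post
$lbId = "domain-c8:a6d0e1cc-8035-4391-ad37-7348bc45efff_0_ennif"
$uri = "https://$nsxFqdn/policy/api/v1/infra/lb-services/$lbId"

$lb = Invoke-RestMethod -Method GET -Uri $uri `
    -Authentication Basic -Credential $cred -SkipCertificateCheck
$lb.size = "SMALL"

# X-Allow-Overwrite is required because the object is protected (_protection = REQUIRE_OVERRIDE)
Invoke-RestMethod -Method PATCH -Uri $uri `
    -Authentication Basic -Credential $cred -SkipCertificateCheck `
    -Headers @{ "X-Allow-Overwrite" = "true" } `
    -ContentType "application/json" `
    -Body ($lb | ConvertTo-Json -Depth 10)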
If the operation was performed successfully, you should see the status change in the NSX-T UI as it reconfigures the LB from Medium to Small. With the resources of a Medium NSX-T Edge, you can have up to 10 Small LBs and 1 Medium LB.
Brent says
Thanks William for your efforts on this for us. I'm looking to enable Kubernetes in vSphere 7.0 in my physical homelab, do you have a sense of when VMUG will make available the vSphere Enterprise Plus with Add-on for Kubernetes license? Or maybe it's already available and I am missing on how to redeem it? I have vSphere 7.0 and NSX-T already deployed. Thanks again for everything, always enjoy your posts.
William Lam says
Sorry, I don't know when they'll have more details. VMUG did say that they'll have official communication (probably over email) when it is available. That said, you can play with vSphere with Kubernetes with just vSphere 7 and NSX-T licenses; the VCF license is for when you plan to use VCF and its tools to deploy. Today, I'm using vSphere 7 (VCSA and ESXi) in Eval Mode, and as long as you've got an NSX-T license, that should work. This is how I'm testing and building vSphere with K8s via my script (which references the NSX-T 3.0 license in its requirements).
Viktor vanden Berg says
Great post William! I was actually playing around with a minimal configuration as well. Most of the issues I faced were around the Edge VM and load balancer(s) not being deployed. Another thing I noticed is that my "physical" ESXi host (part of a single-host cluster) is tagged as incompatible in Enable Workload Management at first. After I created a second cluster with two nested ESXi hosts, both cluster01 and cluster02 show up as compatible clusters to enable Workload Management. Do you have any thoughts on what is going on here?
Daryl D Claiborne says
Hello all, can anyone point me in the right direction? I'm unable to get the script to execute and am getting the error listed below.
PS C:\Users\mrcla\Desktop\Project-Pacific> .\vghetto-vsphere-with-kubernetes-external-nsxt-lab-deployment.ps1
Unable to find C:\Users\mrcla\Desktop\Project-Pacific\vghetto-vsphere-with-kubernetes-external-nsxt-lab-deployment ...
Dimka says
With the Consolidated Architecture model (https://docs.vmware.com/en/VMware-Cloud-Foundation/3.0/com.vmware.vcf.ovdeploy.doc_30/GUID-61453C12-3BB8-4C2A-A895-A1A805931BB2.html), can we run everything on the physical ESXi host, or do we still need nested ESXi?
na.3 says
Hi William,
can we tune this value somewhere in a YAML file?
https://1fichier.com/?6s1chtim69rv4blqdf7x
regards.
Nicholas says
Any chance a minimal install could work on a NUC Skull Canyon with 32GB memory?
Tanaya Umbrani says
Hi William, I executed the script on VC 7.0 instead of 7.0.1. It did create the nested ESXis, HAProxy, and VCSA, and they all seem to be up and running, but the script showed many errors which I am not able to figure out. Can you confirm if it's due to the VC being at version 7.0 instead of 7.0.1?
ThomasD says
Does this script work with the limited export version of NSX-T? I'm having a few problems getting it to work, but am not sure what the cause is. Has anyone got it running with the NSX-T 3.1 limited export version? (Or does anyone know where to find full 3.0 OVAs that can be eval'd?)