One thing I love about the VMware Community is the constant sharing of knowledge and information. I always enjoy discovering new tricks and tidbits from the community, especially as it helps me refine my own knowledge and understanding of a given technology or solution.
My good buddy Ariel Sanchez cc'ed me on Twitter yesterday referencing a blog post by Paul Wilk about an issue he was observing in his Nested ESXi environment when configuring vSphere with Tanzu.
This is interesting! Wonder if @lamw ir @eric_shanks have ever seen something like it
- Ariel Sanchez Mora (@arielsanchezmor) November 15, 2020
This was in regard to the dreaded 404 message displayed in the vSphere UI:
HTTP communication could not be completed with status 404
which is actually not unique to a Nested environment. In fact, this cryptic error message was observed even in the first release of vSphere with Tanzu, which was called vSphere with Kubernetes when it shipped with vSphere 7.0.
Although Paul's conclusion about why his fix worked was not exactly correct, it was the fix itself that I was most interested in. Even with the initial vSphere 7.0 release, I had assumed this was just a cosmetic vCenter Server error message. It was not ideal, but like many other customers, I just ignored it, as the enablement of Workload Management was still successful.
What helped me connect the dots was the fact that Paul solved the problem by disabling the ESXi firewall, which meant this was actually an ESXi issue. Given that this was related to the OVF deployment, I immediately knew what the error was referring to: it is related to an earlier blog post I had shared about a new feature that allows ESXi to "pull" remote OVF/OVA files from an HTTP(S) endpoint. In this case, it was not OVFTool driving the deployment but rather vCenter Server and the Content Library service, which is also responsible for OVF/OVA deployments.
It turns out that as part of deploying the Supervisor VMs, instead of using the typical "push" method for uploading an OVA, vCenter instructs the ESXi host to "pull" the OVA files remotely, and those files are actually hosted on the vCenter Server Appliance (VCSA) itself. Because ESXi does not have the correct port open to reach the OVA hosted on the VCSA, the "pull" method fails and the deployment automatically falls back to the old "push" method. This is why you see the error message and then the deployment immediately continues to progress.
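To make the pull-then-fall-back behavior a bit more concrete, here is a minimal sketch in Python. It is purely a toy model and not how vCenter Server or ESXi actually implement this: the `Host` class, the `deploy` function and the port number are all hypothetical names and values made up for illustration, and the real port used by the VCSA is deliberately not assumed here.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder; NOT the real port the VCSA serves the OVA on.
HYPOTHETICAL_OVA_PORT = 12345

ERROR_MESSAGE = "HTTP communication could not be completed with status 404"


class PullFailed(Exception):
    """Raised when the host cannot fetch the OVA from the remote endpoint."""


@dataclass
class Host:
    # Outbound ports the host firewall allows (illustrative model only).
    allowed_outbound_ports: set = field(default_factory=set)

    def pull(self, port: int) -> str:
        # "Pull": the host itself downloads the OVA from the remote endpoint.
        if port not in self.allowed_outbound_ports:
            raise PullFailed(ERROR_MESSAGE)
        return "pull"

    def push(self) -> str:
        # "Push": the OVA is uploaded to the host instead.
        return "push"


def deploy(host: Host, ova_port: int):
    """Try the pull method first, then fall back to push on failure."""
    errors = []
    try:
        method = host.pull(ova_port)
    except PullFailed as err:
        # The error is surfaced, but the deployment carries on.
        errors.append(str(err))
        method = host.push()
    return method, errors


if __name__ == "__main__":
    blocked = Host()
    opened = Host(allowed_outbound_ports={HYPOTHETICAL_OVA_PORT})
    print(deploy(blocked, HYPOTHETICAL_OVA_PORT))
    print(deploy(opened, HYPOTHETICAL_OVA_PORT))
```

In this sketch the host with the port blocked still ends up with a successful deployment via "push", but an error is recorded along the way, which mirrors why the message looks alarming while Workload Management enablement still completes.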