After upgrading my homelab to vSphere 7.0 Update 3, I noticed that the VMware Event Broker Appliance (VEBA) vSphere UI Plugin, which is included as part of the VEBA Appliance, was no longer functioning properly and would display a "no healthy upstream" error message.
I initially thought this might be environmental, since I had just upgraded my lab from vSphere 7.0 Update 2d to 7.0 Update 3. I reported the issue to our vSphere UI Engineers who built the VEBA UI plugin, and while they were looking into it, we received another report from a VEBA user who was hitting the same issue. Today, I got an update from Engineering, and it looks like there was a regression in the Envoy service running in the VCSA that caused this issue. The issue will be fixed in a future patch update for the VCSA, but in the meantime, VEBA users can apply the workaround outlined below.
Note: This workaround is required for vSphere 7.0 Update 3 and later 7.0 releases. The issue has been fixed in vSphere 8.0, but if you are running any version of 7.0 Update 3 or newer, you will still need to apply this workaround.
Step 1 - SSH to the VCSA Appliance and back up the original ProxyConfiguration.json file by running the following command:
cp /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json.bak
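If you would like to confirm that the backup was created successfully before moving on, an optional sanity check is to compare the two files (assuming the standard cmp utility is available on your VCSA); this step is not required:

cmp /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json.bak && echo "Backup matches the original"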
Step 2 - We first need to unminify the proxy configuration file so that it is human-readable, since the JSON content is compressed into a single line. Luckily, the jq utility is available on the VCSA, and we will use it to write an expanded copy of the configuration to a file called ProxyConfiguration-unminify.json by running the following command:
cat /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json | jq . > ProxyConfiguration-unminify.json
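Optionally, you can verify that the expanded file parses cleanly and actually contains a VEBA entry before editing it (veba here is just the search string used in the next step, not an assumption about your specific hostname):

jq . ProxyConfiguration-unminify.json > /dev/null && echo "Valid JSON"
grep -c veba ProxyConfiguration-unminify.json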
Step 3 - Open the ProxyConfiguration-unminify.json file using the vi editor, search for the string "veba" and locate the section that looks like the following:
{ "name": "remote-plugin-cluster-com.vmware.veba-0.2.0.0-1811688644-veba.primp-industries.local-443", "type": "STRICT_DNS", "connect_timeout": 120, "lb_policy": "VMWARE_FIFO", "endpoints": [{ "socket_address": { "address": "veba.primp-industries.local", "port_value": 443 } }], "tls_context": { "common_tls_context": { "tls_params": { "tls_minimum_protocol_version": "TLSv1_2", "tls_maximum_protocol_version": "TLSv1_2", "cipher_suites": "ECDHE+AESGCM:RSA+AESGCM:ECDHE+AES:RSA+AES", "ecdh_curves": "prime256v1:secp384r1:secp521r1" }, "validation_context": { "verify_certificate_hash": [ "xx:xx:xx:xx" ] } } }, "common_http_protocol_options": { "idle_timeout": 28800 }, "http_protocol_options": { "allow_absolute_url": true, "accept_http_10": true, "default_host_for_http_10": "localhost" } }
We need to add an "sni" entry containing the hostname of the VEBA appliance inside the "tls_context" section, immediately after the closing brace of the "common_tls_context" block, like the following:
, "sni":"veba.primp-industries.local"
Here is what the final change should look like. Please note that a comma is added before the "sni" key:
{ "name": "remote-plugin-cluster-com.vmware.veba-0.2.0.0-1811688644-veba.primp-industries.local-443", "type": "STRICT_DNS", "connect_timeout": 120, "lb_policy": "VMWARE_FIFO", "endpoints": [ { "socket_address": { "address": "veba.primp-industries.local", "port_value": 443 } } ], "tls_context": { "common_tls_context": { "tls_params": { "tls_minimum_protocol_version": "TLSv1_2", "tls_maximum_protocol_version": "TLSv1_2", "cipher_suites": "ECDHE+AESGCM:RSA+AESGCM:ECDHE+AES:RSA+AES", "ecdh_curves": "prime256v1:secp384r1:secp521r1" }, "validation_context": { "verify_certificate_hash": [ "xx:xx:xx:xx" ] } }, "sni":"veba.primp-industries.local" }
Save your changes and exit the file.
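Before re-minifying the file, it is a good idea to make sure the manual edit did not introduce a JSON syntax error (a missed comma or brace is easy to do). This optional check uses the same jq utility:

jq . ProxyConfiguration-unminify.json > /dev/null && echo "JSON is valid" || echo "JSON syntax error, re-check your edit"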
Step 4 - We are now going to re-minify the JSON content and replace the original proxy configuration file by running the following command:
cat ProxyConfiguration-unminify.json | jq -c . > /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json
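If you want to double-check the file that was just put back in place, you can optionally confirm that it still parses and that the new "sni" entry made it into the live configuration (the count should be at least 1):

jq . /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json > /dev/null && grep -c '"sni"' /etc/vmware-rhttpproxy/endpoints.conf.d/ProxyConfiguration.json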
Step 5 - We now need to restart the following two services for the changes to take effect:
vmon-cli -r rhttpproxy && vmon-cli -r envoy
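If you would like to confirm that both services came back up after the restart, one option, assuming the service-control utility that ships with the VCSA, is to grep the service status listing for the two services:

service-control --status --all | grep -E 'rhttpproxy|envoy'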
Step 6 - Lastly, we need to SSH to the VEBA Appliance and restart the VEBA UI pod by running the following command:
kubectl -n vmware-system delete pod $(kubectl -n vmware-system get pods | grep veba-ui | awk '{print $1}')
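The replacement pod can take a short while to come back up. Optionally, you can re-run the following until the new veba-ui pod shows a Running status before looking at its logs:

kubectl -n vmware-system get pods | grep veba-ui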
The pod will take a minute or so to restart, and once the command prompt returns, you can monitor the VEBA UI logs by running the following:
kubectl -n vmware-system logs $(kubectl -n vmware-system get pods | grep veba-ui | awk '{print $1}') -f
You should eventually see the following lines, which indicate that the VEBA UI plugin has been successfully deployed:
2021-10-20 14:58:13.998 INFO 1 --- [ main] o.s.j.e.a.AnnotationMBeanExporter : Registering beans for JMX exposure on startup
2021-10-20 14:58:14.054 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path '/veba-ui'
2021-10-20 14:58:14.059 INFO 1 --- [ main] c.v.sample.remote.SpringBootApplication : Started SpringBootApplication in 18.852 seconds (JVM running for 19.62)
At this point, you can log in to the vSphere UI, or refresh it if you are already logged in, and you should now be able to access the VEBA UI plugin!