After deploying a new VMware Cloud Foundation (VCF) Workload Domain using the VCF Holodeck Toolkit, which leverages Nested ESXi, I noticed the vSphere Cluster Services (vCLS) VMs kept failing to power on and threw the following error message:
No host is compatible with the virtual machine
I thought this was quite strange, especially since the vCLS VMs ran fine when the VCF Management Domain was set up.
UPDATE (07/03/2024) - The reason for the vCLS error is actually due to a misconfiguration of the Nested ESXi VM created by the VCF Holodeck Toolkit, please see this blog post for an easier fix.
Looking at the vmware.log for the vCLS VM, I quickly found the issue: the VM expects the MWAIT CPU instruction to be exposed:
2024-03-19T16:35:35.736Z In(05)+ vmx - Power on failure messages: Feature 'cpuid.mwait' was 0, but must be 0x1.
2024-03-19T16:35:35.736Z In(05)+ vmx - Module 'FeatureCompatLate' power on failed.
2024-03-19T16:35:35.736Z In(05)+ vmx - Failed to start the virtual machine.
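If you want to grab the vmware.log without opening an SSH session to the host, PowerCLI's datastore provider can copy it down locally. This is just a rough sketch; the datastore and VM folder names are placeholders from my lab, so adjust them to match the [datastore] path shown on the vCLS VM's summary page:

# Connect directly to the ESXi host that owns the vCLS VM
Connect-VIServer -Server esxi-5.vcf.sddc.lab -User root -Password VMware123!

# Copy vmware.log from the VM's folder to the current directory via the vmstore: drive
# (when connected directly to an ESXi host, the datacenter shows up as "ha-datacenter";
# the datastore and folder names below are placeholders for my lab)
Copy-DatastoreItem -Item "vmstore:\ha-datacenter\vsanDatastore\vCLS-71c8890f-af95-4e16-89d7-6761859b04a2\vmware.log" -Destination .

Disconnect-VIServer * -Confirm:$false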
I figured I was probably not the first person to run into this, so I asked Ben Sier, who works on Holodeck, and indeed he had hit this before. It looks like newer vSphere releases expect Per-VM EVC to be configured, but the vCLS VM may not function properly within a Nested ESXi environment. Luckily, Ben had a workaround that we can quickly apply.
Step 1 - We first need to upgrade the VM Compatibility (vHW) of the vCLS VMs to the latest version (should be v14 if you are using VCF 5.1 or 8.0 Update 2). While you can adjust the default permissions for the administrator@vsphere.local account, I decided to take a simpler approach by going directly to the ESXi host that is managing the vCLS VM. Right-click on the VM and then click on Upgrade VM Compatibility.
If you wanted to automate this, here is a quick PowerCLI snippet that can be used:
Connect-VIServer -Server esxi-5.vcf.sddc.lab -User root -Password VMware123!

$vm = Get-VM "vCLS-71c8890f-af95-4e16-89d7-6761859b04a2"
$vm | Set-VM -Version v14 -Confirm:$false

Disconnect-VIServer * -Confirm:$false
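If the cluster has spun up more than one vCLS VM, the same snippet can be looped across the ESXi hosts in the Workload Domain so that every vCLS VM gets upgraded in one pass. This is only a sketch; the host names in the list are placeholders from my lab:

# ESXi hosts in the Workload Domain cluster (placeholder names, adjust for your lab)
$esxiHosts = "esxi-5.vcf.sddc.lab","esxi-6.vcf.sddc.lab","esxi-7.vcf.sddc.lab","esxi-8.vcf.sddc.lab"

foreach ($esxi in $esxiHosts) {
    Connect-VIServer -Server $esxi -User root -Password VMware123! | Out-Null

    # Upgrade VM Compatibility for any vCLS VM registered on this host
    Get-VM -Name "vCLS-*" | Set-VM -Version v14 -Confirm:$false

    Disconnect-VIServer -Server $esxi -Confirm:$false
}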
Step 2 - Finally, log back into your vCenter Server and configure Per-VM EVC on the vCLS VM, setting it to Disabled. Within a few seconds, you should see that the vCLS VM can now be successfully powered on and that Per-VM EVC is enabled automatically.
If you wanted to automate this, here is a quick PowerCLI snippet that can be used:
Connect-VIServer -Server esxi-5.vcf.sddc.lab -User root -Password VMware123!

$vm = Get-VM "vCLS-71c8890f-af95-4e16-89d7-6761859b04a2"
$vm.ExtensionData.ApplyEvcModeVM_Task($null,$true)

Disconnect-VIServer * -Confirm:$false
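As with Step 1, if there are multiple vCLS VMs, the EVC call can be looped over each of them on every host. Same caveat as above: the host names are placeholders from my lab:

# ESXi hosts in the Workload Domain cluster (placeholder names, adjust for your lab)
$esxiHosts = "esxi-5.vcf.sddc.lab","esxi-6.vcf.sddc.lab","esxi-7.vcf.sddc.lab","esxi-8.vcf.sddc.lab"

foreach ($esxi in $esxiHosts) {
    Connect-VIServer -Server $esxi -User root -Password VMware123! | Out-Null

    # Apply Per-VM EVC with a null feature mask, which configures EVC as "Disabled" on the VM
    foreach ($vm in (Get-VM -Name "vCLS-*")) {
        $vm.ExtensionData.ApplyEvcModeVM_Task($null,$true) | Out-Null
    }

    Disconnect-VIServer -Server $esxi -Confirm:$false
}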
medwardsea7fe2717f says
Great article as always William. Just a heads up, it looks like you may have inadvertently posted the same code snippet twice 😉
William Lam says
Doh! It's fixed
Rodney Barnhardt says
Thank you for posting this. I was building out a lab using the Holodeck 2.0 5.1.1 build and ran into this exact issue. I did have to run the commands on each of the vCLS VMs as they were created, but it did correct the issue.
SwissTiger says
Thanks William, this worked great on ESXi 8, but after upgrading to 8.0U3 24022510, which is IA, it seems it doesn't work anymore. The vCLS VMs are at compatibility version 8.0U2 when generated and use the Photon CRX OS instead of "other". I run a nested lab in Proxmox and it worked until I upgraded to the latest 8.0U3; the vCLS VMs are generated with EVC on in Intel "Merom" compatibility and I can't turn it off. After trying to load balance the Cluster VMs, the Cluster gets degraded after 3 mins. Any tests from your end with the latest nested ESXi?
William Lam says
Please see https://williamlam.com/2024/07/incorrect-guestos-type-for-nested-esxi-causes-vcls-issues-with-vmware-cloud-foundation-vcf-holodeck-toolkit.html for the actual reason why this happens and a better solution. In vSphere 8.0 Update 3, vCLS has been re-architected and no longer uses VMs but rather PodVMs (CRX containers), so I'm not sure what impact that might have.