For resource-constrained environments, deploying VMware Cloud Foundation (VCF) can take longer, especially when deploying on top of a Nested ESXi configuration. However, the VCF Installer does provide a robust retry function that will typically resolve most intermittent issues.
With that said, in more resource-constrained environments, you may notice that the NSX Manager component fails to complete its initialization within the default timeout period. You can add the nsxt.manager.wait.minutes setting to increase the time (in minutes) that VCF Installer / SDDC Manager will wait for NSX to be ready.
echo "nsxt.manager.wait.minutes=180" >> /etc/vmware/vcf/domainmanager/application-prod.properties
echo 'y' | /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh
Note: SDDC Manager is responsible for deploying NSX, so the setting above should be applied to SDDC Manager. By default, the VCF Installer switches to the SDDC Manager function, which means the setting above is actually applied to the VCF Installer unless you override this behavior in the JSON deployment file.
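One thing to watch with the append approach is that re-running it leaves duplicate entries in the file. Here is a minimal sketch of a helper that updates a key if it already exists and appends it otherwise. The set_property function is my own hypothetical name and not something that ships with VCF, so treat it as a starting point rather than a supported tool.

```shell
# set_property FILE KEY VALUE
# Hypothetical helper (not part of VCF): updates KEY in FILE if it is
# already present, otherwise appends it, so re-running never leaves
# duplicate entries behind.
set_property() {
    file=$1
    key=$2
    value=$3
    # Escape dots in the key so grep and sed match them literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}=" "$file" 2>/dev/null; then
        # Rewrite through a temp file, then copy the content back so the
        # original file keeps its ownership and permissions.
        tmp=$(mktemp)
        sed "s|^${pattern}=.*|${key}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s\n' "${key}=${value}" >> "$file"
    fi
}
```

You would call it as set_property /etc/vmware/vcf/domainmanager/application-prod.properties nsxt.manager.wait.minutes 180 and then restart the services with the same sddcmanager_restart_services.sh script as before.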
If you need to increase the timeout for the NSX Edge deployment, you can add the edge.node.vm.creation.max.wait.minutes setting to increase the time (in minutes) that VCF Installer / SDDC Manager will wait for the NSX Edge to be ready.
echo "edge.node.vm.creation.max.wait.minutes=90" >> /etc/vmware/vcf/domainmanager/application-prod.properties
echo 'y' | /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh
Note: The settings above are applicable to both VCF 5.x and VCF 9.x.
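Before restarting the services, it can be handy to confirm which value is actually in the file, especially if a key was appended more than once. The sketch below prints the last value found for a key. It assumes that a later entry overrides an earlier one, which is typical for Java-style properties files but is not something I have confirmed for SDDC Manager specifically. The get_property name is hypothetical.

```shell
# get_property FILE KEY
# Hypothetical helper: prints the value from the last "KEY=" line in FILE.
# Assumption: later duplicates override earlier ones.
get_property() {
    awk -F= -v k="$2" '
        # Compare the text before the first "=" with the requested key.
        $1 == k { v = substr($0, length(k) + 2); found = 1 }
        END     { if (found) print v }
    ' "$1"
}
```

For example, get_property /etc/vmware/vcf/domainmanager/application-prod.properties edge.node.vm.creation.max.wait.minutes prints the configured value, or nothing at all if the key has not been set.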
In addition, the VCF Installer will also delete a failed resource after a certain number of retries, which can make the failure tricky to debug.
Here are two additional settings that can be useful: the first retains the failed component for troubleshooting by disabling the undo on failure, and the second increases the number of retries (default is 3):
echo "orchestrator.task.undoOnFailure=false" >> /etc/vmware/vcf/domainmanager/application-prod.properties
echo "orchestrator.task.retry.max=5" >> /etc/vmware/vcf/domainmanager/application-prod.properties
echo 'y' | /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh
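If you are applying several of these settings at once, you may want a backup of the properties file first and only a single service restart at the end. Here is a rough sketch of that idea. The apply_settings function and the .bak suffix are my own choices for illustration and are not part of VCF.

```shell
# apply_settings FILE KEY=VALUE...
# Hypothetical helper: backs up FILE to FILE.bak, then appends each
# KEY=VALUE argument as its own line.
apply_settings() {
    file=$1
    shift
    # Keep a copy so the original configuration can be restored later.
    cp -p "$file" "${file}.bak" || return 1
    for setting in "$@"; do
        printf '%s\n' "$setting" >> "$file"
    done
}
```

A call might look like apply_settings /etc/vmware/vcf/domainmanager/application-prod.properties nsxt.manager.wait.minutes=180 orchestrator.task.retry.max=5, followed by one run of the sddcmanager_restart_services.sh script shown above so that all of the changes are picked up together.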
This might be what I'm running into when importing a vSphere 8 virtual infrastructure into VCF. I'm trying to simulate what we might do in the real world. It builds the 3 NSX Managers, but watching the NSX VIP you can see it trying to add the transport to the 3 virtual ESXi hosts in the fabric, and then it just fails. The task in VCF Operations seems to lack detail on exactly what failed, or maybe that's my lack of understanding, but I'm trying.
Thanks, William, for the tip. I adjusted the timeout per your instructions, and I have now successfully imported vSphere 8 into VCF 9 with 3 managers and each host having the transport configured.