VMworld Barcelona is just around the corner, and this week I started building out the different demo environments, which will all be running on VMware Cloud on AWS. In one of the demos, I need to have ESX 3.0 running. Yes, you read that correctly! ESX as in the original version with the Service Console (COS), which some of you may be too young to remember from the good ol' days. 😉
First, let me be clear: there really is no good reason for this other than nostalgia and what I am trying to demonstrate in our VMworld session. If you are curious about the demo and are attending VMworld, be sure to sign up for HBI1967BE Workload Migration Techniques for On-Premises and Cloud Infrastructures, which I will be co-presenting with Emad Younis. Secondly, Nested Virtualization, whether it is the latest version of ESXi or our very first release, is not officially supported.
While attempting to boot the ESX 3.0 (Build 312855) installer, I found that it simply kept crashing with the following error message:
WARNING: VMK: 538: Initialization of vmkernel failed, status 0xbad0001
Honestly, this did not really surprise me, given that this is a 13-year-old OS which is no longer supported and we do not actively test older ESX releases as a guest OS. In fact, a 25-year-old OS like MS-DOS is more likely to run without issues than an older ESX release and, interestingly enough, it still runs like a champ, even on VMware Cloud on AWS.
I knew this would be a long shot, but I reached out to Engineering to see if there was a solution. The error was fairly generic, so I was pleasantly surprised that we found a workaround once we understood the issue. It turns out that in earlier releases of ESX, we derived the available colors from the L3 cache info. Hardware platforms have changed quite significantly since then, and many of the assumptions that were made about a CPU in the early days simply no longer apply. While debugging the issue, we found that the value generated from the L3 cache was not a power of two, which the system expected, and that is what caused the error. It turns out we could override this behavior by simply using the L2 cache instead, which did give us a power of two value.
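To make the power-of-two problem a bit more concrete, here is a rough Python sketch. It assumes the textbook page-coloring formula (cache size divided by associativity times page size), which may not be exactly how ESX 3.0 computed its value, and the cache geometries are illustrative numbers I picked, not values read from the actual host. The function names are hypothetical as well.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def cache_colors(cache_bytes: int, ways: int, page_size: int = PAGE_SIZE) -> int:
    """Textbook page-coloring count: cache size / (associativity * page size).

    This is an assumption for illustration, not the confirmed ESX 3.0 formula.
    """
    return cache_bytes // (ways * page_size)


def is_power_of_two(n: int) -> bool:
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


# Illustrative (hypothetical) cache geometries for a modern server CPU.
l3_colors = cache_colors(45 * 1024 * 1024, ways=20)  # large shared L3
l2_colors = cache_colors(256 * 1024, ways=8)         # small per-core L2

if __name__ == "__main__":
    for name, colors in (("L3", l3_colors), ("L2", l2_colors)):
        print(f"{name}: {colors} colors, power of two: {is_power_of_two(colors)}")
```

With these example numbers the L3 yields 576 colors, which is not a power of two, while the L2 yields 8, which is. That matches the shape of the problem described above, even if the real values on the host were different.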
The workaround was to simply add the following VM advanced setting to the Nested ESX VM:
cpuid.maxCacheLevel=2
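If you manage the VM's .vmx file directly rather than using the Advanced Configuration Parameters dialog in the vSphere Client, the setting could be applied with something like the Python sketch below. It only manipulates the text of a .vmx file (which uses `key = "value"` lines), the helper name `set_vmx_option` is hypothetical, and it assumes the VM is powered off while the file is edited.

```python
import re


def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return the .vmx text with `key` set to `value`.

    An existing entry for the key is replaced (duplicates are dropped);
    otherwise the entry is appended at the end.
    """
    new_line = f'{key} = "{value}"'
    # .vmx keys are matched case-insensitively here to be safe.
    pattern = re.compile(rf"^\s*{re.escape(key)}\s*=", re.IGNORECASE)

    out = []
    replaced = False
    for line in vmx_text.splitlines():
        if pattern.match(line):
            if not replaced:
                out.append(new_line)
                replaced = True
            # Any further duplicate of the key is skipped.
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Example .vmx content (made up for illustration).
sample_vmx = 'displayName = "Nested-ESX-3.0"\nvirtualHW.version = "7"\n'
updated_vmx = set_vmx_option(sample_vmx, "cpuid.maxCacheLevel", "2")
```

Running the helper a second time on its own output leaves the file unchanged, so it is safe to re-apply.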
After the change was applied, the ESX Installer booted without any issues and I could install ESX 3.0 in a VM running on VMware Cloud on AWS!
Wow nothing says Virtually Ghetto like this experiment lol! Really awesome!
For the less skilled architects, would the HW version have been a suitable workaround? I know that in practice ESX 6.7 only supports HW versions back to a certain level, but could we have just tweaked the HW version to get the same effect (see https://kb.vmware.com/s/article/1003746)?
Greg,
No, the info is coming directly from the CPU itself and vHW is simply passing that through. I'm actually using vHW7 for the ESX 3.x VM.