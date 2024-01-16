Debugging a recent issue with VMware Workstation and Intel Hybrid CPUs gave me an idea for an experiment to try with ESXi and Intel Hybrid CPUs.

As a refresher, starting with the Intel 12th Generation (Alder Lake) CPU, a new hybrid big.LITTLE CPU architecture was introduced for consumer Intel CPUs. This hybrid architecture integrates two types of CPU cores, Performance-cores (P-cores) and Efficiency-cores (E-cores), into the same physical CPU die. For more information about this new hybrid Intel CPU design, check out this resource HERE. The ESXi scheduler does not support this Intel Hybrid CPU architecture and there are currently no plans to add support, especially as this type of architecture is not found in traditional Enterprise datacenters and is limited to Intel Consumer CPUs.

The current recommendation to work around the non-uniformity of the CPU cores is to disable either the E-cores or the P-cores within the system BIOS, thus making the system "uniform" and allowing ESXi to run like a normal x86 system. While you can apply a workaround to have ESXi ignore the non-uniformity of the CPU cores, doing so leads to non-deterministic behavior, and random PSODs can also occur due to scheduling across two different types of cores.

I was curious to see whether applying ESXi CPU affinity to a VM on an Intel Hybrid CPU might yield a different outcome.

I first wanted to see if I could identify which CPU cores were P-cores versus E-cores in ESXi. For my experiment, I used the same Intel NUC 13 Pro that I had used for the VMware Workstation debugging, which has an Intel i7 1360P (4 x P-Cores and 8 x E-Cores).

The observed behavior with VMware Workstation was that all P-cores (including hyperthreading) came first, followed by the E-cores.

Cores [ 0, 1, 2, 3, 4, 5, 6, 7] are all P-Cores (includes HT cores)

Cores [8, 9, 10, 11, 12, 13, 14, 15] are all E-Cores
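The ordering above can be written down as a small helper. This is only a sketch in Python: the function name `hybrid_layout` and its parameters are made up for illustration, and it assumes the "P-cores first, then E-cores" enumeration observed on this particular system, which may not hold on other hardware.

```python
def hybrid_layout(p_cores, e_cores, threads_per_p_core=2):
    """Return logical CPU indices per core type.

    Assumption: P-core threads are enumerated first, followed by the
    E-cores (E-cores have no hyperthreading, so one index each).
    """
    p_count = p_cores * threads_per_p_core
    return {
        "P": list(range(p_count)),
        "E": list(range(p_count, p_count + e_cores)),
    }


# i7 1360P with hyperthreading visible: 4 P-cores (2 threads each) + 8 E-cores
layout = hybrid_layout(p_cores=4, e_cores=8)
print(layout["P"])  # [0, 1, 2, 3, 4, 5, 6, 7]
print(layout["E"])  # [8, 9, 10, 11, 12, 13, 14, 15]
```

Passing `threads_per_p_core=1` models a system where hyperthreading is off, in which case the E-core indices simply start right after the last P-core.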

Unlike VMware Workstation, when ESXi observes non-uniform CPU cores, HT is automatically disabled by ESXi and thus the P-Cores do not show up as twice as many logical CPUs. To confirm whether ESXi has the same P and E-Core ordering behavior as VMware Workstation, I performed a simple test by iterating through each core, pinning a Windows VM to it and benchmarking the performance using the popular CPU-Z utility. From this basic test, I was able to conclude that P-Cores were indeed ordered first, followed by the E-Cores.

Cores [ 0, 1, 2, 3] are all P-Cores (no HT cores)

Cores [4, 5, 6, 7, 8, 9, 10, 11] are all E-Cores
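If you repeat the per-core benchmark on your own system, one simple way to turn the scores into two groups is to sort them and cut at the largest drop. The Python sketch below assumes you have collected one single-thread score per core index by hand. The function name `split_by_largest_gap` is hypothetical and the scores in the example are made-up placeholders, not measurements from this experiment.

```python
def split_by_largest_gap(scores):
    """Split cores into (faster, slower) groups at the biggest score drop.

    scores: dict mapping core index -> benchmark score (higher is faster).
    """
    if len(scores) < 2:
        raise ValueError("need scores for at least two cores")
    # Rank core indices from fastest to slowest
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Drop in score between each pair of neighbours in the ranking
    gaps = [scores[ranked[i]] - scores[ranked[i + 1]] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return sorted(ranked[:cut]), sorted(ranked[cut:])


# Placeholder numbers for illustration only (NOT real CPU-Z results)
example_scores = {
    0: 700, 1: 705, 2: 698, 3: 702,
    4: 410, 5: 405, 6: 412, 7: 408,
    8: 409, 9: 404, 10: 411, 11: 407,
}
faster, slower = split_by_largest_gap(example_scores)
print(faster)  # [0, 1, 2, 3]
print(slower)  # [4, 5, 6, 7, 8, 9, 10, 11]
```

This only works when the two core types are clearly separated in the benchmark; noisy or overlapping scores would need a more careful approach.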

Using this information, we can now create VMs that are affinitized to either P-Cores or E-Cores to ensure consistent performance and hopefully avoid any inconsistent behaviors when scheduling across different types of cores. If you are using a standalone non-managed ESXi host (e.g. no vCenter Server), you can configure CPU affinity for a VM by using the ESXi Host Client and expanding the CPU configuration section.
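As far as I understand the vSphere documentation, the scheduling affinity field accepts comma-separated core numbers and hyphenated ranges (for example "0, 2, 4-7"). Treat that format as an assumption and check it against your version. The Python sketch below, with a hypothetical helper named `affinity_string`, collapses a list of core indices into that form so you can paste it into the field.

```python
def affinity_string(cores):
    """Collapse core indices into comma/range form, e.g. [0, 2, 4, 5, 6, 7] -> '0,2,4-7'."""
    ordered = sorted(set(cores))
    if not ordered:
        raise ValueError("affinity set must not be empty")

    parts = []
    start = prev = ordered[0]
    for core in ordered[1:]:
        if core == prev + 1:
            # Still inside a consecutive run
            prev = core
            continue
        # Run ended: emit it as a single number or a start-end range
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = core
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


print(affinity_string([0, 1]))              # 0-1
print(affinity_string([4, 5]))              # 4-5
print(affinity_string([0, 2, 4, 5, 6, 7]))  # 0,2,4-7
```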

Here is a VM configured with 2 x P-Cores, which I have affinitized to Cores 0 and 1.



Here is a VM configured with 2 x E-Cores, which I have affinitized to Cores 4 and 5.



If you have vCenter Server, it looks like CPU affinity was removed from the vSphere UI at some point and the only way to apply CPU affinity is by using the vSphere API. Below is a quick PowerCLI snippet for applying CPU affinity to a specific VM. This might even be the better option, as it lets you script the required affinity instead of clicking through the UI.

$vm = Get-VM "Win10-PCore-0-1"
$affinitySpec = New-Object VMware.Vim.VirtualMachineAffinityInfo
$affinitySpec.AffinitySet = @(0,1)
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.cpuAffinity = $affinitySpec
$task = $vm.ExtensionData.ReconfigVM_Task($spec)
$task1 = Get-Task -Id ("Task-$($task.value)")
$task1 | Wait-Task
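Since the affinity set is now typed by hand, it is easy to accidentally mix core types, which is exactly what this experiment is trying to avoid. Below is a minimal Python sketch of a sanity check you could run before applying a set. The core index sets are assumptions taken from the ESXi ordering observed on this NUC (HT disabled), `validate_affinity` is a hypothetical helper, and the "at least as many cores as vCPUs" rule is a conservative choice for this sketch rather than a documented ESXi requirement.

```python
# Assumed layout for this specific system (ESXi, HT disabled): adjust for yours
P_CORES = frozenset(range(0, 4))
E_CORES = frozenset(range(4, 12))


def validate_affinity(affinity, num_vcpus):
    """Return 'P' or 'E' if the affinity set stays within one core type."""
    chosen = set(affinity)
    if not chosen:
        raise ValueError("affinity set must not be empty")
    if len(chosen) < num_vcpus:
        # Conservative rule for this sketch: one physical core per vCPU
        raise ValueError("fewer cores in the affinity set than vCPUs")
    if chosen <= P_CORES:
        return "P"
    if chosen <= E_CORES:
        return "E"
    raise ValueError("affinity set mixes P-cores and E-cores or uses unknown cores")


print(validate_affinity([0, 1], num_vcpus=2))  # P
print(validate_affinity([4, 5], num_vcpus=2))  # E
```

A set such as `[3, 4]` raises a `ValueError`, since core 3 is a P-core and core 4 is an E-core in this layout.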

As expected, we can see that the VM configured with 2 x P-Cores outperforms the VM configured with 2 x E-Cores.



For those looking to squeeze the most out of their hardware investments when using the new Intel Hybrid CPU Cores, there is at least an option to get consistent performance. The cost is manual CPU core assignment, which could yield some CPU inefficiencies depending on how demanding your workloads are. I am curious to hear from the community on whether this is actually a feasible option for real world workloads, since this was a pretty basic experiment and YMMV.