The ESXi MAC Learn dvFilter Fling was released a little over two years ago and it has become a must-have when it comes to running our ESXi Hypervisor within a VM, also referred to as Nested ESXi. The reason this Fling has become such a popular hit amongst our customers and partners is that it greatly improves performance when “Promiscuous Mode” is enabled on a Virtual or Distributed Virtual Portgroup, which is a requirement for using Nested ESXi. Although this Fling works great, there are a couple of limitations with this solution today. The first, which is called out in the original Fling release notes, is that once a MAC Address has been learned, it never ages out, which is not ideal for long-running Nested ESXi environments that generate a large number of new MAC Addresses. The second is the lack of vMotion support: the learned MAC Address table is not transferred to the destination ESXi host and must be re-learned.
To help address both of these limitations, the folks over in the Network and Security Business Unit (NSBU) have been working hard to improve upon the existing solution and have developed a new native MAC Learning VMkernel module called the Learnswitch. This new Learnswitch not only helps improve Nested ESXi workloads, it can also potentially benefit other workloads such as Nested Containers or other 3rd Party network inspection software. One immediate difference from the previous MAC Learn dvFilter solution is that rather than operating on the Network IO Chain, the filtering is now performed within the outer virtual switch layer itself, which will provide some additional performance gains. The other added benefit from an internal VMware standpoint is that the Learnswitch is now vmkapi compatible, which means we will have a better backwards-compatibility story for supporting old releases of ESXi. One downside of the new solution is its switch support: because the dvFilter operated below the virtual switch layer, it could support both a Virtual Standard Switch and a Distributed Virtual Switch, whereas the new Learnswitch requires a Distributed Virtual Switch. If you currently do not meet the requirements of the new Learnswitch, you can continue using the dvFilter. It is recommended that you do not mix both on a single system, but you can definitely make use of both solutions across different ESXi hosts depending on the constraints of your environment.
Here are some of the new capabilities provided by the new Learnswitch module:
- Overlay Network based: learning and filtering are done in the Etherswitch forwarding check
- MAC Address learning is based on VLAN ID or VXLAN ID on uplink and leaf port
- Packet is filtered on uplink and leaf port if the MAC is learned on a different port
- MAC Address table size is 32k per system
- MAC Address aging support with a configurable aging time (default is 5 minutes)
- Unknown unicast packets are flooded by default; this can be configured to drop them instead
- vMotion support: the MAC table learned on the port is transferred to the destination host and a RARP packet is sent
- Standalone VMkernel module available as a VIB
- net-learnswitch CLI to display MAC Address table, configuration and stats
Here are the requirements for using the Learnswitch:
- Either a vSphere 6.5p01+ or vSphere 6.0 environment
- ESXi host configured with a Distributed Virtual Switch (VDS)
- Both Promiscuous Mode and Forged Transmit are still required on the outer VDS or Distributed Portgroup (applicable only for Nested ESXi use cases)
- System with Python running to configure the Learnswitch. (Make sure you have both the python-six & python-yaml packages installed as these are needed by the script)
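To make the learning, aging, flooding and filtering behaviour from the capability list a bit more concrete, here is a minimal Python sketch of a learning table. It is purely illustrative and is not how the VMkernel module is implemented: the class name, method names and port labels are all made up for this example, and only the defaults (5 minute aging, 32k entries, flood unknown unicast) come from the list above.

```python
import time


class MacLearningTable:
    """Toy MAC learning table keyed by (VLAN ID, MAC address). Illustrative only."""

    def __init__(self, aging_seconds=300, max_entries=32 * 1024,
                 flood_unknown=True, clock=time.monotonic):
        self.aging_seconds = aging_seconds    # default aging time: 5 minutes
        self.max_entries = max_entries        # table size: 32k entries
        self.flood_unknown = flood_unknown    # flood (True) or drop (False)
        self._clock = clock
        self._entries = {}                    # (vlan, mac) -> (port, last_seen)

    def learn(self, vlan, mac, port):
        """Record that `mac` was seen as a source address on `port`."""
        key = (vlan, mac.lower())
        if key not in self._entries and len(self._entries) >= self.max_entries:
            return False  # table full; this sketch simply skips learning
        self._entries[key] = (port, self._clock())
        return True

    def lookup(self, vlan, mac):
        """Return the learned port, or None if unknown or aged out."""
        key = (vlan, mac.lower())
        entry = self._entries.get(key)
        if entry is None:
            return None
        port, last_seen = entry
        if self._clock() - last_seen > self.aging_seconds:
            del self._entries[key]  # entry has aged out
            return None
        return port

    def output_ports(self, vlan, src_mac, dst_mac, in_port, ports):
        """Learn the source, then decide which ports should receive the frame."""
        self.learn(vlan, src_mac, in_port)
        learned = self.lookup(vlan, dst_mac)
        if learned is None:
            # Unknown unicast: flood to every other port, or drop.
            if not self.flood_unknown:
                return []
            return [p for p in ports if p != in_port]
        # Known destination: deliver only to the port it was learned on.
        return [learned] if learned != in_port else []
```

With this model, the first frame toward an unknown destination is flooded, and once the destination has sent a frame of its own, later frames are delivered only to the port where it was learned. That is the behaviour that avoids every VM on a promiscuous portgroup seeing every packet.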
Step 1 - Download the ESXi-Learnswitch.zip package and extract its contents onto your desktop. You will find that it contains the following four files:
Step 2 - Copy either the VMware-ESXi-6.5.0-5161263-learnswitch.zip (for an ESXi 6.5 host) or the VMware-ESXi-6.0.0-5223106-learnswitch.zip (for an ESXi 6.0 host) to your ESXi host. To install the VIB, run the following ESXCLI command:
esxcli software vib install -d /VMware-ESXi-6.5.0-5161263-learnswitch.zip
Note: If you installed the VIB on an ESXi 6.0 system and you plan to upgrade to ESXi 6.5, make sure you uninstall the VIB before installing the 6.5 VIB.
Step 3 - Reboot the ESXi host for the changes to go into effect.
Step 4 - Extract the VMware-pyVpx-6.5.0-4602587.zip onto a system that has Python running.
Step 5 - Move the learnswitch_cfg.py into the pyVpx directory that was created from the previous step and then change into pyVpx directory.
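Before running the configuration script, it can be worth checking that the Python side of the requirements is in place. The following is a small sketch of such a check, not something that ships with the Fling: the `preflight` function is a name invented here. It relies on the python-six and python-yaml packages providing the `six` and `yaml` modules, and on the script using `ssl._create_unverified_context`, which is missing on older Python 2.7 builds (a commenter below ran into this and was told to move to Python 2.7.9 or above).

```python
import importlib
import ssl
import sys


def preflight(modules=("six", "yaml")):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            problems.append("missing Python module: " + name)
    # The configuration script uses this helper; older Python 2.7 builds lack it.
    if not hasattr(ssl, "_create_unverified_context"):
        problems.append("ssl._create_unverified_context is unavailable; "
                        "Python %d.%d.%d looks too old" % sys.version_info[:3])
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

If this prints nothing, the modules import cleanly and the SSL helper is present. A missing module is usually fixed by installing the corresponding package with your OS package manager or pip.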
Step 6 - Finally, we just need to enable the Learnswitch on the Distributed Portgroup(s) that we plan to use for our Nested ESXi workloads. To do so, we need to first edit the learnswitch_cfg.py and update it with our vCenter Server credentials along with specifying the list of Distributed Portgroup(s) we want enabled. Look for the following section shown below and update it with your own environment configuration.
Here is an example of what this looks like for my environment:
## CONFIG ##
vc_user = "*protected email*"
vc_password = "VMware1!"
dvpg_name_list = [ 'DVPG-Nested-ESXi-Workload-1', 'DVPG-Nested-ESXi-Workload-2' ]
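Since the CONFIG section is plain Python, a stray bracket or a missing quote will only show up as a syntax error when the script runs. As a rough sketch (the helper names `parse_config` and `validate_config` are invented for this example and are not part of learnswitch_cfg.py), you could paste the edited section into a string and check it with Python's `ast` module first. This version assumes Python 3.

```python
import ast


def parse_config(text):
    """Return {name: value} for simple `name = literal` assignments in `text`."""
    tree = ast.parse(text)  # raises SyntaxError on e.g. an unmatched bracket
    values = {}
    for node in tree.body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            values[node.targets[0].id] = ast.literal_eval(node.value)
    return values


def validate_config(text):
    """Check that the three settings named in the CONFIG section look sane."""
    cfg = parse_config(text)
    for key in ("vc_user", "vc_password", "dvpg_name_list"):
        if key not in cfg:
            raise ValueError("missing setting: " + key)
    names = cfg["dvpg_name_list"]
    if not isinstance(names, list) or not names:
        raise ValueError("dvpg_name_list must be a non-empty list")
    if not all(isinstance(n, str) and n.strip() for n in names):
        raise ValueError("dvpg_name_list entries must be non-empty strings")
    return cfg


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    sample = '''
vc_user = "user@example.com"
vc_password = "VMware1!"
dvpg_name_list = [ 'DVPG-Nested-ESXi-Workload-1', 'DVPG-Nested-ESXi-Workload-2' ]
'''
    print(validate_config(sample)["dvpg_name_list"])
```

The check only looks at the shape of the values. It does not contact vCenter Server or confirm that the portgroups exist.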
In my environment, I have the following configured:
Once you have saved your changes, run the script with the "add" option and specify the Hostname/IP Address of your vCenter Server, the name of the Distributed Virtual Switch and the IP Address of your ESXi host (do not use hostname).
python learnswitch_cfg.py 192.168.1.200 VDS 192.168.1.100 add
Note: If you have more than one ESXi host, you will need to run this script for each of the ESXi hosts.
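If you have a handful of hosts, a small wrapper can generate the per-host invocations for you. The sketch below is only an illustration and assumes Python 3: `build_commands` is a made-up helper, the second host address (192.168.1.101) is a hypothetical example, and the argument order simply mirrors the command shown above (vCenter Server, VDS name, ESXi IP Address, action). It also rejects hostnames up front, since the script expects an IP Address for the ESXi host.

```python
import ipaddress


def build_commands(vcenter, vds_name, esxi_ips, action,
                   script="learnswitch_cfg.py"):
    """Return one argument list per ESXi host for the configuration script."""
    if action not in ("add", "remove"):
        raise ValueError("action must be 'add' or 'remove'")
    commands = []
    for ip in esxi_ips:
        # Raises ValueError if `ip` is a hostname rather than an IP Address.
        ipaddress.ip_address(ip)
        commands.append(["python", script, vcenter, vds_name, ip, action])
    return commands


if __name__ == "__main__":
    hosts = ["192.168.1.100", "192.168.1.101"]  # second host is hypothetical
    for cmd in build_commands("192.168.1.200", "VDS", hosts, "add"):
        print(" ".join(cmd))
```

This only prints the commands so you can review them first. Each argument list could then be run from within the pyVpx directory, for example by passing it to `subprocess.run`.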
At this point, you have successfully installed and configured the new Learnswitch module. You can start deploying and running your Nested ESXi workloads just as you did before, but rather than having to configure individual vNICs on your Nested ESXi VM to benefit from MAC Learning, you simply place your Nested ESXi VMs on the Distributed Virtual Portgroups that have MAC Learning enabled. Pretty easy, right!?
If you want to disable the MAC Learn functionality on a particular set of Distributed Virtual Portgroup(s), you just need to specify the "remove" option in the script by running the following:
python learnswitch_cfg.py 192.168.1.200 VDS 192.168.1.100 remove
If you wish to completely remove the Learnswitch module, after disabling the functionality on the Distributed Portgroup(s), you just need to uninstall the VIB and reboot the ESXi host. To do so, run the following ESXCLI command:
esxcli software vib remove -n esx-learnswitch
net-learnswitch CLI Examples
In addition to the Learnswitch VMkernel module, the VIB also includes a really handy net-learnswitch command-line utility.
If you have a VM provisioned onto the Distributed Portgroup(s) which has the Learnswitch enabled, you can run the following command and specify the name of your VDS to list more details:
net-learnswitch --instance VDS-6.5 --list
You can also retrieve statistics for either the entire VDS instance or even filter on individual Distributed Portgroup(s) by using the following command:
net-learnswitch --instance VDS --stats
Another useful command dumps out the entire MAC Address table, which is where you can identify aged MAC Address(es) that should be removed.
net-learnswitch --instance VDS --mac-address-table
For a complete list of options with the net-learnswitch CLI, you can specify the -h option.
Lastly, I would like to give a big shoutout to Subin Mathew who has been the lead developer behind the Learnswitch. Thanks for all the awesome work you have done to help further improve running Nested ESXi, even if it is still not "officially" supported :D. Also, a huge thanks to Christian Dickmann who initially started this effort with the MAC Learn dvFilter, our customers truly appreciate it as do all of us who run Nested ESXi for lab and educational purposes.
Dan Dunckel says
Awesome stuff. We hope to use this in the near future.
Small typo: `net-learnswitch --instasnce VDS --stats` (should be `instance`). Should help the copy/pasters!
William Lam says
Fixed, thanks for the catch
If you run NSX on the top level ESXi would checking the option Enable MAC learning achieve the same outcome for the dvportgroup?
William Lam says
No, that feature actually does something else and is not related in any way to this. If you're running Nested ESXi, this is something you should consider implementing in your environment
Hi William, I am trying to set this up in my environment, but each time that I execute the python script I receive the following:
File "learnswitch_cfg.py", line 344, in <module>
File "learnswitch_cfg.py", line 313, in main
content = get_vc_content()
File "learnswitch_cfg.py", line 244, in get_vc_content
ssl._create_default_https_context = ssl._create_unverified_context
AttributeError: 'module' object has no attribute '_create_unverified_context'
What do I need to check in order to fix it?
Chirag Radhakrishnan says
You need to move to Python 2.7.9 or above.
Hussam Sawaqed (@HussamSawaqed) says
Can you please explain more how we can enable the python-six and yaml packages so the script can work.
PS: I'm new to python
Thanks in advance.
Kane Charles says
How would we go about using this with stateless hosts, ie Auto Deployed? I assume this configuration needs to stick to the ESXi host, and having it stateless would not make that possible...
I too am experiencing problems with the script as I don't know Python, and I'm using Windows. Is there a PowerCLI version of the script or a tutorial on how to install/set up the two additional pieces (six, yaml)?
Or, is there some explanation of what the script does in generic terms so that I could write something myself? There was enough detail in the Learn dvFilter to be able to write something myself.
Having worked on this some more, mostly by resolving dependencies, I'm stuck here:
J:\Data\Technology_Vendors\VMware\Flings\ESXi_Learnswitch\ESXi-Learnswitch-v1.0.1>python vSphere6Lab-learnswitch_cfg.ver02.py DvSwitch_for_TOPELHhost01 add
Traceback (most recent call last):
File "vSphere6Lab-learnswitch_cfg.ver02.py", line 30, in <module>
from pyVmomi import Vim, Vmodl, SoapStubAdapter, VmomiSupport
ImportError: cannot import name Vim
Yet, I do have pyvmomi installed:
Microsoft Windows [Version 10.0.14393]
Please unzip VMware-pyVpx-6.5.0-4602587.zip, put the learnswitch_cfg.py file inside the newly created pyVpx directory and run it from there. That worked for me. I think this should be explicitly written in the description above.
Dan Dunckel says
Are "aged" mac addresses automatically removed?
Dan Dunckel says
Also, when a portgroup is deleted are the corresponding mac address entries removed?
Jonathan Brown says
I am attempting to do something similar to nesting an ESXi hypervisor inside of an existing ESXi host (v6.0) but with a slight twist. Rather than nesting an ESXi hypervisor, I'm attempting to use a CentOS 7 VM running OpenVSwitch and attach the OpenVSwitch bridge to a portgroup on the VDS. The OpenVSwitch is then intended to provide a VXLAN tunnel out of the ESXi virtual environment to a remote KVM server (also using OpenVSwitch) that provides the distant end of the VXLAN tunnel. The KVM server also has two VMs running on it. The intention is for the VMs on the KVM server to be able to talk to any of the VMs attached to the same portgroup VLAN ID inside the ESXi environment as if they were collocated locally. I hope that's not too confusing. If you're interested I've actually drawn up a layout of what we're doing here: https://imgur.com/a/BpkPC
I'm running into the issues that you'd described in some of your earlier posts about using the promiscuous mode setting on the portgroup and seeing duplication of packets being sent to all VMs. I'm thinking that the Learnswitch might be the answer to my problems but thought I'd run it by you before I went down that path.
Anyway, I appreciate any help you can provide.
I believe it has a bug when the PG is a trunk: it doesn't learn MAC addresses from the uplink port.
The PG toward ESXi is a trunk, the nested VM sends tagged frames, and I see the MAC address on the upstream physical switch.
On the remote computer I see the ARP entry. So the nested VM's tagged packets go out, but on ingress on the uplink the switch doesn't learn the MACs.
I set the PG to VLAN and I see the MAC table start populating. (Since the VLAN I set is the same, it creates the correct entries for the uplinks.)
net-learnswitch --instance Flood --mac-address-table
f0:9f:c2:0f:4e:b4 Uplink 30 0 83886082 20
f0:9f:c2:0f:4e:b4 Uplink 10 0 83886082 26
00:1b:21:bc:5d:1a Uplink 30 0 83886082 10
I switched the PG back to a trunk, and since the nested VM still tags with 30/10 and the switch has the right table, it works.
I'm trying to understand the use cases of the MAC LEARNING option when implementing an LS on NSX.
And when you say "MAC Address learning is based on VLAN ID or VXLAN ID on uplink and leaf port", what VLAN are we talking about?
Ravi Kumar says
I have written a blog on how to install Python 2.7.12 for new users so that they can run the ESXi Learnswitch without issues. I am posting it here because I see a lot of users are facing the same issues I faced. Hope this helps.
Is it possible to manually enter the code via CLI instead of installing the VIB or is the coding too long/cumbersome to manually enter?
William Lam says
No, that's just how this works. If you're on vSphere 6.7, Native MAC Learning is now part of vSphere and you can enable it much easier. Search on my blog for those details
Clay Quinn says
Any word on vSphere 7 compatibility?
William Lam says
There’s no change in behavior 🙂
Gian Luca Ziosi says
With ESXi 6.7 and Essential Plus license (so no VDS) are there any other solutions? Is the previous dvFilter usable? Thanks.